---
title: "Pollution Outsourcing and Democracy 2"
author: "Ella Henninger"
date: "2023-07-25"
output:
  word_document: default
  html_document: default
---

# Contents of the replication packet

This R-Markdown file replicates all analyses and figures included in the paper "What Makes Democracies Greener? The Role of Pollution Offshoring" by Tobias Böhmelt, Ella Henninger, and Thomas Bernauer.

## Data
 - **Outsourcing data**: Presberger and Bernauer (2023), https://www.sciencedirect.com/science/article/pii/S0959378023000031?via%3Dihub
 - **Pollution data**: Van Donkelaar et al. (2021), https://pubs.acs.org/doi/full/10.1021/acs.est.1c05309
 - **Main data**: Quality of Governance database (QoG), https://www.gu.se/en/quality-government/qog-data
 - **Economic data**: World Development Indicators (2023), https://databank.worldbank.org/source/world-development-indicators#
 - **Democracy data**: V-Dem (2023), https://www.v-dem.net/data/the-v-dem-dataset/


**The code is structured as follows**:

1. Set up

2. Preparatory steps
2.1 Load data
2.2 Create lagged and logged variables

3. Descriptives
3.1 Table summary statistics
3.2 Correlations
3.3 Trends

4. Analyses
4.1 Models democracy and pollution outsourcing: Model group 1
4.2 Models pollution outsourcing and environmental performance: Model group 2
4.3 Plots model group 1
4.4 Plots model group 2

5. Quantities of interest
5.1 Simulated interaction effects
5.2 Marginal effects plots
5.3 Expected values: Model group 1 (within)
5.4 Expected values: Model group 1 (Between)
5.5 Expected values & first differences: Model group 2 (within)
5.6 Expected values & first differences: Model group 2 (between)

6. Robustness checks
6.1 Replacing DV with PM2.5
6.2 Replacing DV with carbon footprint
6.3 Different outsourcing variables
6.4 Alternative operationalization outsourcing
6.5 Alternative operationalization democracy
6.6 Additional controls
6.7 Drop high-income states
6.8 Sample split instead of interaction term

**As a note**: You may be asked to update some packages, please press "yes" to make sure the file runs smoothly.


# Set up

```{r setup, include=FALSE}

## clean environment
rm(list = ls())

# Define which packages needed for analyses
p_needed <-
  c("knitr",
    "dplyr", 
    "psych",
    "stargazer",
    "ggplot2",
    "viridis",
    "haven",
    "MASS",
    "modelsummary",
    "WDI",
    "plm",
    "lmtest",
    "ggmap",
    "maps",
    "tidyr",
    "devtools",
    "data.table",
    "sjPlot",
    "interplot",
    "fastDummies",
    "texreg",
    "interflex",
    "reshape2",
    "marginaleffects",
    "ggExtra",
    "cowplot")

# Check which packages are already installed on your computer
packages <- rownames(installed.packages())

# Check which packages are not installed
p_to_install <- p_needed[!(p_needed %in% packages)]

if (length(p_to_install) > 0) {
  install.packages(p_to_install)
}
sapply(p_needed, require, character.only = TRUE)


# Set an option for the final document that can be produced from the .Rmd file.
knitr::opts_chunk$set(echo = TRUE)

## For replicability: session information 
session_info <- print(sessionInfo())

```



# Preparatory steps

## Load data

```{r load data}

#-------------------------------------------------------------------------------
# Data
#-------------------------------------------------------------------------------
data <- read_dta("data/QoG/qog_std_ts_jan23_stata14.dta")

#-------------------------------------------------------------------------------
# Filter data
#-------------------------------------------------------------------------------
data <- data[which(data$year >= 1989),]
data <- data[which(data$year <= 2022),]

```

## Create lagged and logged variables 

```{r variable selection}

#-------------------------------------------------------------------------------
# Create logs
#-------------------------------------------------------------------------------
data$insourced_GHG <- data$cf_cc_pop_mean
data$log_wdi_co2 <- log(1 + data$wdi_co2)  ## plus 1 to avoid negative numbers

## outsourced CO2
data$log_cc_outs <- log(data$cf_cc_pop_mean_outs)
data$log_cc_sum_outs <- log(data$cf_cc_pop_sum_outs + 1)

## other outsourcing variabes
data$log_lu_outs <- log(data$cf_lu_pop_mean_outs + 1)
data$log_en_outs <- log(data$cf_en_pop_mean_outs)
data$log_mf_outs <- log(data$cf_mf_pop_mean_outs)
data$log_bw_outs <- log(data$cf_bw_pop_mean_outs)

## pm2.5
data$log_pm25_pop <- log(data$populationweightedpm25ugm3)
data$log_pm25_geo <- log(data$geographicmeanpm25ugm3)

## carbon footprint of consumption
data$log_ef_carb <- log(data$ef_carb)

## population
data$log_pop <- log(data$wdi_pop)
data$log_pop_den <- log(data$wdi_popden)

## gdp
data$log_gdp_pc <- log(data$wdi_gdpcapcon2015)
data$log_gdp_pc_sq <- data$log_gdp_pc^2
data$log_gdp <- log(data$wdi_gdppppcon2017)


#-------------------------------------------------------------------------------
# Create environmental treaties variable
#-------------------------------------------------------------------------------
data <- data %>% 
  group_by(ccode) %>% 
  mutate(env_treaties = ifelse(any(!is.na(iea_sum_i)), zoo::na.locf(iea_sum_i), NA))

data$env_treaties <- data$env_treaties/100

#-------------------------------------------------------------------------------
# Transform Freedom House civil liberties variable
#-------------------------------------------------------------------------------
data$fh_cl <- 8 - data$fh_cl

#-------------------------------------------------------------------------------
# Rescale political globalization variable (1-10)
#-------------------------------------------------------------------------------
data$dr_pg <- data$dr_pg/100

#-------------------------------------------------------------------------------
# Subset data to relevant variables
#-------------------------------------------------------------------------------
data_s <- dplyr::select(data, "cname", "ccode", "year", "ccodealp", 
                              "insourced_GHG", "cf_cc_pop_mean", "cf_cc_pop_mean_outs", "cf_cc_pop_sum_outs", "log_cc_outs",
                              "log_cc_sum_outs",
                              "log_lu_outs", "log_mf_outs", "log_bw_outs", "log_en_outs",
                              "log_pm25_pop", "log_pm25_geo", "log_wdi_co2", "wdi_co2",
                              "populationweightedpm25ugm3", "geographicmeanpm25ugm3",
                              "ef_carb", "log_ef_carb",
                              "vdem_polyarchy", "log_gdp_pc", "log_gdp_pc_sq", "log_gdp",
                              "log_pop", "log_pop_den", "fh_cl", "fh_fog",
                              "cf_bw_pop_mean_outs", "cf_en_pop_mean_outs", "cf_lu_pop_mean_outs", "cf_mf_pop_mean_outs",
                              "dr_pg", "fh_status", "env_treaties",  "wdi_trade",
                              "p_polity2", "vdem_partip", "vdem_delibdem", "wdi_gdpind")

```



# Descriptives

## Table summary statistics

```{r descriptives}

#-------------------------------------------------------------------------------
# Table: Summary statistics
#-------------------------------------------------------------------------------
data_descript <- data_s[which(!is.na(data_s$log_cc_outs) &
                              data_s$log_ef_carb != -Inf),]

# options("modelsummary_format_numeric_latex" = "plain")
# datasummary(formula = (`Pollution Outsourcing` = cf_cc_pop_mean_outs) + (`ln(Pollution Outsourcing)` = log_cc_outs) +
#                       (`ln(CO2)` = log_wdi_co2) +
#                       (`ln(PM2.5) (pop. weighted)` = log_pm25_pop) +
#                       (`ln(Environm. Footprint Consumpt.)` = log_ef_carb) +
#                       (`Democracy` = vdem_polyarchy) + (`ln(GDP)` = log_gdp) +
#                       (`ln(GDP per capita)` = log_gdp_pc) + (`ln(Populat. Size)`= log_pop) +
#                       (`Civil liberties (Freedom House)` = fh_cl) +
#                       (`Democracy (Polity V)` = p_polity2) + (`Political Globaliz.` = dr_pg) +
#                       (`Trade Openness` = wdi_trade) + (`Environm. Treaties` = env_treaties)
#               ~ N + Mean + SD + Min + Median + Max,
#             title = "Summary statistics",
#             data = data_descript,
#             output = "latex")

```

## Correlations

```{r correlations}

#-------------------------------------------------------------------------------
## Correlation heat map
#-------------------------------------------------------------------------------
data_cor <- data_descript[,c("log_cc_outs", "log_pm25_pop", "log_wdi_co2", "log_ef_carb",
                             "vdem_polyarchy", "log_gdp", "log_gdp_pc", "log_pop", 
                             "fh_cl", "p_polity2", "wdi_trade", "env_treaties", "dr_pg", "wdi_gdpind")]

data_cor <- na.omit(data_cor)

cormat <- round(cor(data_cor), 2)

melted_cormat <- melt(cormat)

ggplot(data = melted_cormat, aes(x=Var1, y=Var2, fill=value)) + 
  scale_fill_gradient2(midpoint= 0, 
                       low = viridis(3)[1], 
                       mid = "white",
                       high = viridis(3)[2], 
                       na.value = "grey50", space ="Lab") +
  theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust = 1)) +
  geom_tile()

```

## Trends

```{r trends}

#-------------------------------------------------------------------------------
## Differentiate inequality increasing / decreasing countries over time
#-------------------------------------------------------------------------------

## create lead variable
data_trend <- data_s %>%
  group_by(ccode) %>%
  arrange(year)%>%
  mutate(lead_cc_outs = dplyr::lead(cf_cc_pop_mean_outs),
         lead_pm25 = dplyr::lead(populationweightedpm25ugm3),
         lead_democracy = dplyr::lead(vdem_polyarchy))    

## name columns
data_trend <- data_trend[,c("ccode", "year", "cf_cc_pop_mean_outs", "lead_cc_outs",
                            "populationweightedpm25ugm3", "lead_pm25", "vdem_polyarchy", "lead_democracy")]

## add inequality trend column
data_trend$trend_outsource <- NA
data_trend$trend_pollution <- NA
data_trend$trend_democracy <- NA

## get variable that indicates increase vs. decrease in t+1 compared to t
for(i in 1:nrow(data_trend)){
data_trend$trend_outsource[i] <- ifelse((!is.na(data_trend$lead_cc_outs[i]) & !is.na(data_trend$cf_cc_pop_mean_outs[i])),     # if both top10 and lead top10 not NA
                                         (data_trend$lead_cc_outs[i] - data_trend$cf_cc_pop_mean_outs[i]), NA)                # insert diff between them 

data_trend$trend_pollution[i] <- ifelse((!is.na(data_trend$lead_pm25[i]) & !is.na(data_trend$populationweightedpm25ugm3[i])),     # if both top10 and lead top10 not NA
                                         (data_trend$lead_pm25[i] - data_trend$populationweightedpm25ugm3[i]), NA)                # insert diff between them 

data_trend$trend_democracy[i] <- ifelse((!is.na(data_trend$lead_democracy[i]) & !is.na(data_trend$vdem_polyarchy[i])),     # if both top10 and lead top10 not NA
                                         (data_trend$lead_democracy[i] - data_trend$vdem_polyarchy[i]), NA)                # insert diff between them 
}

## get overall trend (more increasing vs. more decreasing years)
summary_trend_outsource <- data_trend %>%
  group_by(ccode) %>%
  summarize(sum_trend_outsource = ifelse(all(is.na(trend_outsource)), NA, sum(trend_outsource, na.rm = TRUE)),
            var_trend_outsource = ifelse(all(is.na(trend_outsource)), NA, var(trend_outsource, na.rm = TRUE))) 

summary_trend_pollution <- data_trend %>%
  group_by(ccode) %>%
  summarize(sum_trend_pollution = ifelse(all(is.na(trend_pollution)), NA, sum(trend_pollution, na.rm = TRUE)),
            var_trend_pollution = ifelse(all(is.na(trend_pollution)), NA, var(trend_pollution, na.rm = TRUE))) 

summary_trend_democracy <- data_trend %>%
  group_by(ccode) %>%
  summarize(sum_trend_democracy = ifelse(all(is.na(trend_democracy)), NA, sum(trend_democracy, na.rm = TRUE)),
            var_trend_democracy = ifelse(all(is.na(trend_democracy)), NA, var(trend_democracy, na.rm = TRUE))) 


## colnames
colnames(summary_trend_outsource)[1:3] <- c("ccode", "trend_outsource", "var_trend_outsource")
colnames(summary_trend_pollution)[1:3] <- c("ccode", "trend_pollution", "var_trend_pollution")
colnames(summary_trend_democracy)[1:3] <- c("ccode", "trend_democracy", "var_trend_democracy")

## combine
data_s <- left_join(data_s, summary_trend_outsource, by = "ccode")
data_s <- left_join(data_s, summary_trend_pollution, by = "ccode")
data_s <- left_join(data_s, summary_trend_democracy, by = "ccode")


summary(data_s$trend_outsource)
summary(data_s$trend_pollution)
summary(data_s$trend_democracy)
summary(data_s$var_trend_outsource)
summary(data_s$var_trend_pollution)
summary(data_s$var_trend_democracy)

## clean up
remove(data_trend)


#-------------------------------------------------------------------------------
# Plot trend variable
#-------------------------------------------------------------------------------
min_diff <- round(min(data_s$trend_pollution, na.rm = TRUE), digits = 2)
max_diff <- round(max(data_s$trend_pollution, na.rm = TRUE), digits = 2)

pdf("plots/trend_pollution.pdf")

ggplot(data_s[which(!is.na(data_s$trend_pollution)),],
       aes(x = reorder(cname, trend_pollution), y = trend_pollution, fill = trend_pollution >= 0)) +
  theme_minimal() +
  geom_bar(stat = "identity") +
  labs(y = "Change in Pollution", x = "Countries") +

  scale_fill_manual(name = NULL,
                    breaks = c("TRUE", "FALSE"),
                    labels = c("Increase", "Decrease"),
                    values = c(viridis(3)[2], 
                               viridis(3)[1])) +
  
  theme(axis.text.x = element_blank(), 
        axis.text.y = element_text(size = 3),
        axis.ticks.y = element_blank(),
        axis.ticks.x = element_blank(),
        axis.title = element_text(size = 12),
        legend.position = "bottom", 
        legend.text = element_text(size = 10)) +
  
  # annotate("text", x = 168, y = 3.7, label = "Max (India) \n+0.17", size = 3.2) +
  # annotate("text", x = 10, y = -4, label = "Min (Maldives) \n-0.18", size = 3.2) +
  coord_flip()

dev.off()


#-------------------------------------------------------------------------------
# Plot trend variable
#-------------------------------------------------------------------------------
min_diff <- round(min(data_s$trend_outsource, na.rm = TRUE), digits = 2)
max_diff <- round(max(data_s$trend_outsource, na.rm = TRUE), digits = 2)

pdf("plots/trend_outsource.pdf")

ggplot(data_s[which(!is.na(data_s$trend_outsource)),],
       aes(x = reorder(cname, trend_outsource), y = trend_outsource, fill = trend_outsource >= 0)) +
  theme_minimal() +
  geom_bar(stat = "identity") +
  labs(y = "Change in Outsourcing", x = "Countries") +

  scale_fill_manual(name = NULL,
                    breaks = c("TRUE", "FALSE"),
                    labels = c("Increase", "Decrease"),
                    values = c(viridis(3)[2], 
                               viridis(3)[1])) +
  
  theme(axis.text.x = element_blank(), 
        axis.text.y = element_text(size = 3),
        axis.ticks.y = element_blank(),
        axis.ticks.x = element_blank(),
        axis.title = element_text(size = 12),
        legend.position = "bottom", 
        legend.text = element_text(size = 10)) +
  
  # annotate("text", x = 168, y = 3.7, label = "Max (India) \n+0.17", size = 3.2) +
  # annotate("text", x = 10, y = -4, label = "Min (Maldives) \n-0.18", size = 3.2) +
  coord_flip()

dev.off()


#-------------------------------------------------------------------------------
# Plot trend variable
#-------------------------------------------------------------------------------
min_diff <- round(min(data_s$trend_democracy, na.rm = TRUE), digits = 2)
max_diff <- round(max(data_s$trend_democracy, na.rm = TRUE), digits = 2)

pdf("plots/trend_democracy.pdf")

ggplot(data_s[which(!is.na(data_s$trend_democracy)),],
       aes(x = reorder(cname, trend_democracy), y = trend_democracy, fill = trend_democracy >= 0)) +
  theme_minimal() +
  geom_bar(stat = "identity") +
  labs(y = "Change in Democracy", x = "Countries") +

  scale_fill_manual(name = NULL,
                    breaks = c("TRUE", "FALSE"),
                    labels = c("Increase", "Decrease"),
                    values = c(viridis(3)[2], 
                               viridis(3)[1])) +
  
  theme(axis.text.x = element_blank(), 
        axis.text.y = element_text(size = 3),
        axis.ticks.y = element_blank(),
        axis.ticks.x = element_blank(),
        axis.title = element_text(size = 12),
        legend.position = "bottom", 
        legend.text = element_text(size = 10)) +
  
  # annotate("text", x = 168, y = 3.7, label = "Max (India) \n+0.17", size = 3.2) +
  # annotate("text", x = 10, y = -4, label = "Min (Maldives) \n-0.18", size = 3.2) +
  coord_flip()

dev.off()

```



# Analyses

## Create mean and demeaned variables

```{r variable demeaning}

#-------------------------------------------------------------------------------
# Mean and de-meaning
#-------------------------------------------------------------------------------
data_s <- data_s %>%
  group_by(ccode) %>%
  mutate(
    demeaned_log_co2 = log_wdi_co2 - mean(log_wdi_co2, na.rm = T),
    demeaned_log_pm25 = log_pm25_pop - mean(log_pm25_pop, na.rm = T),
    demeaned_log_cc_outs = log_cc_outs - mean(log_cc_outs, na.rm = T),
    demeaned_log_cc_sum_outs = log_cc_sum_outs - mean(log_cc_sum_outs, na.rm = T),
    demeaned_log_bw_outs = log_bw_outs - mean(log_bw_outs, na.rm = T),
    demeaned_log_en_outs = log_en_outs - mean(log_en_outs, na.rm = T),
    demeaned_log_lu_outs = log_lu_outs - mean(log_lu_outs, na.rm = T),
    demeaned_log_mf_outs = log_mf_outs - mean(log_mf_outs, na.rm = T),
    demeaned_cc_outs  = cf_cc_pop_mean_outs - mean(cf_cc_pop_mean_outs, na.rm = T),
    demeaned_ef_carb = ef_carb - mean(ef_carb, na.rm = T),
    demeaned_env_treat = env_treaties - mean(env_treaties, na.rm = T),
    demeaned_dr_pg = dr_pg - mean(dr_pg, na.rm = T),
    demeaned_trade = wdi_trade - mean(wdi_trade, na.rm = T),
    demeaned_gdp = log_gdp - mean(log_gdp, na.rm = T),
    demeaned_vdem_poly = vdem_polyarchy - mean(vdem_polyarchy, na.rm = T),
    demeaned_fh = fh_cl - mean(fh_cl, na.rm = T),
    demeaned_polity = p_polity2 - mean(p_polity2, na.rm = T),
    demeaned_log_pop = log_pop - mean(log_pop, na.rm = T),
    demeaned_log_gdp_pc = log_gdp_pc - mean(log_gdp_pc, na.rm = T),
    demeaned_log_gdp_pc_sq = log_gdp_pc_sq - mean(log_gdp_pc_sq, na.rm = T),
    mean_cf_log_outs = mean(cf_cc_pop_mean_outs, na.rm = T),
    mean_log_co2 = mean(log_wdi_co2, na.rm = T),
    mean_log_pm25 = mean(log_pm25_pop, na.rm = T),
    mean_log_cc_outs = mean(log_cc_outs, na.rm = T),
    mean_log_cc_sum_outs = mean(log_cc_sum_outs, na.rm = T),
    mean_log_bw_outs = mean(log_bw_outs, na.rm = T),
    mean_log_en_outs = mean(log_en_outs, na.rm = T),
    mean_log_lu_outs = mean(log_lu_outs, na.rm = T),
    mean_log_mf_outs = mean(log_mf_outs, na.rm = T),
    mean_wdi = mean(wdi_co2, na.rm = T),
    mean_cc_outs = mean(cf_cc_pop_mean_outs, na.rm = T),
    mean_ef_carb = mean(ef_carb, na.rm = T),
    mean_env_treat = mean(env_treaties, na.rm = T),
    mean_dr_pg = mean(dr_pg, na.rm = T),
    mean_trade = mean(wdi_trade, na.rm = T),
    mean_gdp = mean(log_gdp, na.rm = T),
    mean_vdem_poly = mean(vdem_polyarchy, na.rm = T),
    mean_fh = mean(fh_cl, na.rm = T),
    mean_polity = mean(p_polity2, na.rm = T),
    mean_log_pop = mean(log_pop, na.rm = T),
    mean_log_gdp_pc = mean(log_gdp_pc, na.rm = T),
    mean_log_gdp_pc_sq = mean(log_gdp_pc_sq, na.rm = T),
    interaction = log_cc_outs * vdem_polyarchy,
    interaction_bw = log_bw_outs * vdem_polyarchy,
    interaction_en = log_en_outs * vdem_polyarchy,
    interaction_lu = log_lu_outs * vdem_polyarchy,
    interaction_mf = log_mf_outs * vdem_polyarchy,
    interaction_fh = log_cc_outs * fh_cl,
    interaction_polity = log_cc_outs * p_polity2,
    interaction_cc_sum = log_cc_sum_outs * vdem_polyarchy,
    interaction_demeaned = interaction - mean(interaction, na.rm = T),
    interaction_bw_demeaned = interaction_bw - mean(interaction_bw, na.rm = T),
    interaction_en_demeaned = interaction_en - mean(interaction_en, na.rm = T),
    interaction_lu_demeaned = interaction_lu - mean(interaction_lu, na.rm = T),
    interaction_mf_demeaned = interaction_mf - mean(interaction_mf, na.rm = T),
    interaction_fh_demeaned = interaction_fh - mean(interaction_fh, na.rm = T),
    interaction_polity_demeaned = interaction_polity - mean(interaction_polity, na.rm = T),
    interaction_cc_sum_demeaned = interaction_cc_sum - mean(interaction_cc_sum, na.rm = T),
    interaction_mean = mean(log_cc_outs, na.rm = T) * mean(vdem_polyarchy, na.rm = T),
    interaction_bw_mean = mean(log_bw_outs, na.rm = T) * mean(vdem_polyarchy, na.rm = T),
    interaction_en_mean = mean(log_en_outs, na.rm = T) * mean(vdem_polyarchy, na.rm = T),
    interaction_lu_mean = mean(log_lu_outs, na.rm = T) * mean(vdem_polyarchy, na.rm = T),
    interaction_mf_mean = mean(log_mf_outs, na.rm = T) * mean(vdem_polyarchy, na.rm = T),
    interaction_fh_mean = mean(log_cc_outs, na.rm = T) * mean(fh_cl, na.rm = T),
    interaction_polity_mean = mean(log_cc_outs, na.rm = T) * mean(p_polity2, na.rm = T),
    interaction_cc_sum_mean = mean(log_cc_sum_outs, na.rm = T) * mean(vdem_polyarchy, na.rm = T)
  )%>%
  ungroup()

```

## Within-Between models: Model group 1

```{r within-between 1}

## Reduce to correct length
data_ova <- data_s[which(!is.na(data_s$demeaned_cc_outs) &
                         !is.na(data_s$demeaned_vdem_poly) &
                         !is.na(data_s$demeaned_log_pop) &
                         !is.na(data_s$demeaned_log_gdp_pc) &
                         !is.na(data_s$log_wdi_co2)),]

#-------------------------------------------------------------------------------
# Model group 1
#-------------------------------------------------------------------------------
m1_wb <- lmer(log_cc_outs ~ demeaned_vdem_poly + mean_vdem_poly + as.factor(year) +
                (1 | ccode), 
              data = data_s)

m2_wb <- lmer(log_cc_outs ~ demeaned_log_pop + demeaned_log_gdp_pc + demeaned_log_gdp_pc_sq + 
               mean_log_pop + mean_log_gdp_pc + mean_log_gdp_pc_sq +
               as.factor(year) + (1 | ccode),
              data = data_ova)

m3_wb <- lmer(log_cc_outs ~ demeaned_vdem_poly + demeaned_log_gdp_pc + demeaned_log_gdp_pc_sq + 
                demeaned_log_pop + 
                mean_vdem_poly + mean_log_gdp_pc + mean_log_gdp_pc_sq + mean_log_pop + as.factor(year) +
                (1 | ccode), 
              data = data_ova)

m4_wb <- lmer(log_cc_outs ~ demeaned_vdem_poly + demeaned_log_gdp_pc + demeaned_log_gdp_pc_sq + 
                demeaned_log_pop + 
                mean_vdem_poly + mean_log_gdp_pc + mean_log_gdp_pc_sq + mean_log_pop + as.factor(year) +
                (1 | ccode),
               data = data_s[which(!is.na(data_s$log_pm25_pop)),])

stargazer(list(m1_wb, m2_wb, m3_wb), type = "text", omit = c("year"))

#-------------------------------------------------------------------------------
## Output
#-------------------------------------------------------------------------------
texreg(list(m1_wb, m2_wb, m3_wb),
       stars = c(0.01, 0.05, 0.1),
       #file = "texreg1.doc",
       custom.header = list("Pollution Outsourcing" = 1:3),
       caption = "Democracy and Pollution Outsourcing",
       label = "tab:table1",
       caption.above = T,
       fontsize = "scriptsize",
       custom.coef.map = list("demeaned_vdem_poly" = "Democracy (Within)",
                              "mean_vdem_poly" = "Democracy (Between)",
                              "demeaned_log_pop" = "Population (Within)",
                              "mean_log_pop" = "Population (Between)",
                              "demeaned_log_gdp_pc" = "GDP per capita (Within)",
                              "demeaned_log_gdp_pc_sq" = "GDP per capita$^2$ (Within)", 
                              "mean_log_gdp_pc" = "GDP per capita (Between)",
                              "mean_log_gdp_pc_sq" = "GDP per capita$^2$ (Between)", 
                              "(Intercept)" = "Constant"),
        custom.gof.rows = list("Controls" = c("\\xmark", "\\ding{51}", "\\ding{51}"),
                               "Country-FE" = c("\\ding{51}", "\\ding{51}", "\\ding{51}"),
                               "Year-FE" = c("\\ding{51}", "\\ding{51}", "\\ding{51}")))



```

## Plot coefficients: Model group 1

```{r plots coefficients 1}

#-------------------------------------------------------------------------------
# Coefficient plot 
#-------------------------------------------------------------------------------
coefs_within <- as.data.frame(summary(m3_wb)$coefficients[c(2,5,3,4),])
coefs_between <- as.data.frame(summary(m3_wb)$coefficients[c(6,9,7,8),])

# Create data frames for each model
coefs_within$CI_low <- coefs_within[,"Estimate"] - qnorm(0.95)*coefs_within[,"Std. Error"]
coefs_within$CI_high <- coefs_within[,"Estimate"] + qnorm(0.95)*coefs_within[,"Std. Error"]

coefs_between$CI_low <- coefs_between[,"Estimate"] - qnorm(0.95)*coefs_between[,"Std. Error"]
coefs_between$CI_high <- coefs_between[,"Estimate"] + qnorm(0.95)*coefs_between[,"Std. Error"]


pdf("Plots/Coefficients_table1_within_between.pdf", width = 7.5)

par(mar = c(5.1, 10, 2.5, 2.1)) # c(bottom, left, top, right))

plot(1,
     type = "n",
     ylim = c(0,2),
     bty = "n",
     xlab = "Coefficients and 95% confidence intervals",
     xlim = c(-1.75, 2.25),
     ylab = "",
     yaxt = "n",
     cex.lab = 1.1)


# add grid
segments(x0 = c(-1.25,-1,-0.75, -0.5, -0.25, 0, 0.25, 0.5, 0.75, 1, 1.25, 1.5, 1.75, 2), y0 = 0,
         x1 = c(-1.25,-1,-0.75, -0.5, -0.25, 0, 0.25, 0.5, 0.75, 1, 1.25, 1.5, 1.75, 2), y1 = 2,
         col = "lightgrey",
         lty = "solid", lwd = 0.7)

segments(x0 = -1.5, y0 = rev(c(0.25, 0.5, 0.75, 1,
                               1.25, 1.5, 1.75)),
         x1 = 2, y1 = rev(c(0.25, 0.5, 0.75, 1,
                               1.25, 1.5, 1.75)),
         col = "lightgrey",
         lty = "solid", lwd = 0.7)

# add null line
segments(x0 = 0, y0 = 0,
         x1 = 0, y1 = 2.5,
         lty = "solid", lwd = 1, col = "black")


# add point estimates
points(x = coefs_within$Estimate,
       y = rev(c(0.25, 0.75, 1.25, 1.75)),
       pch = 16,
       col = c(viridis(3)[1], viridis(3)[1], viridis(3)[1], viridis(3)[1]),
       cex = 1)

# add CIs
segments(y0 = rev(c(0.25, 0.75, 1.25, 1.75)), 
         y1 = rev(c(0.25, 0.75, 1.25, 1.75)), 
         col = c(viridis(3)[1], viridis(3)[1], viridis(3)[1], viridis(3)[1]),
         x0 = coefs_within$CI_low,
         x1 = coefs_within$CI_high,
         lwd = 3, cex = 1.3, lend = 1)

# add point estimates
points(x = coefs_between$Estimate,
       y = rev(c(0.15, 0.65, 1.15, 1.65)),
       pch = 16,
       col = c(viridis(3)[2], viridis(3)[2], viridis(3)[2], viridis(3, alpha = 0.6)[2]),
       cex = 1)

# add CIs
segments(y0 = rev(c(0.15, 0.65, 1.15, 1.65)), 
         y1 = rev(c(0.15, 0.65, 1.15, 1.65)), 
         col = c(viridis(3)[2], viridis(3)[2], viridis(3)[2], viridis(3, alpha = 0.3)[2]),
         x0 = coefs_between$CI_low,
         x1 = coefs_between$CI_high,
         lwd = 3, cex = 1.3, lend = 1)

# add coefficients text
text(x = -2.75,
     y = rev(c(0.25, 0.75, 1.25, 1.75)),
     labels = c("Democracy", "Population", "GDP per capita", expression("GDP per capita"^2)),
     xpd = T,
     cex = 0.9,
     pos = 4)

# Add a legend
legend("topright", 
       legend = c("Within", "Between"), 
       col = c(viridis(3)[1],viridis(3)[2]), 
       pch = 19,
       bty = "n")

dev.off()

```

## Within-Between models: Model group 2

```{r within-between 2}

#-------------------------------------------------------------------------------
# Model group 2
#-------------------------------------------------------------------------------
m5_wb <- lmer(log_wdi_co2 ~ demeaned_log_cc_outs + demeaned_vdem_poly + 
               mean_log_cc_outs + mean_vdem_poly +
               as.factor(year) + (1 | ccode),
              data = data_ova)

m6_wb <- lmer(log_wdi_co2 ~ demeaned_log_cc_outs + demeaned_vdem_poly +
               demeaned_log_cc_outs * mean_vdem_poly + 
               mean_log_cc_outs + mean_vdem_poly +
               interaction_mean +
               as.factor(year) + (1 | ccode),
              data = data_ova)

m7_wb <- lmer(log_wdi_co2 ~ demeaned_log_pop + demeaned_log_gdp_pc + demeaned_log_gdp_pc_sq + 
               mean_log_pop + mean_log_gdp_pc + mean_log_gdp_pc_sq + 
               as.factor(year) + (1 | ccode),
              data = data_ova)

m8_wb <- lmer(log_wdi_co2 ~ demeaned_log_cc_outs + demeaned_vdem_poly + 
                demeaned_log_cc_outs * mean_vdem_poly +
                demeaned_log_pop + demeaned_log_gdp_pc + demeaned_log_gdp_pc_sq +
                mean_log_cc_outs + mean_vdem_poly +
                mean_log_cc_outs * mean_vdem_poly +
                mean_log_pop +  mean_log_gdp_pc + mean_log_gdp_pc_sq +
                as.factor(year) + (1 | ccode),
              data = data_ova)

stargazer(m5_wb, m6_wb, m7_wb, m8_wb, type = "text", omit = c("ccode", "year"))

#-------------------------------------------------------------------------------
# Output
#-------------------------------------------------------------------------------
texreg(list(m5_wb, m6_wb, m7_wb, m8_wb),
       stars = c(0.01, 0.05, 0.1),
      # file = "texreg2.doc",
       custom.header = list("CO$_2$ emissions" = 1:4),
       caption = "Environmental Performance and Pollution Outsourcing",
       custom.model.names = c("Model 4", "Model 5", "Model 6", "Model 7"),
       label = "tab:table2",
       caption.above = T,
       fontsize = "scriptsize",
       custom.coef.map = list("demeaned_log_cc_outs" = "Pollution Outsourcing (Within)",
                              "mean_log_cc_outs" = "Pollution Outsourcing (Between)",
                              "demeaned_vdem_poly" = "Democracy (Within)",
                              "mean_vdem_poly" = "Democracy (Between)",
                              "demeaned_log_cc_outs:mean_vdem_poly" = "Pollution Outsourcing x Democracy (Within)",
                              "interaction_mean" = "Pollution Outsourcing x Democracy (Between)",
                              "mean_vdem_poly:mean_log_cc_outs" = "Pollution Outsourcing x Democracy (Between)",
                              "demeaned_log_pop" = "Population (Within)",
                              "mean_log_pop" = "Population (Between)",
                              "demeaned_log_gdp_pc" = "GDP per capita (Within)",
                              "demeaned_log_gdp_pc_sq" = "GDP per capita$^2$ (Within)",
                              "mean_log_gdp_pc" = "GDP per capita (Between)",
                              "mean_log_gdp_pc_sq" = "GDP per capita$^2$ (Between)",
                              "(Intercept)" = "Constant"),
        custom.gof.rows = list("Controls" = c("\\xmark", "\\xmark", "\\ding{51}", "\\ding{51}"),
                               "Country-FE" = c("\\ding{51}", "\\ding{51}", "\\ding{51}", "\\ding{51}"),
                               "Year-FE" = c("\\ding{51}", "\\ding{51}", "\\ding{51}", "\\ding{51}")))



#-------------------------------------------------------------------------------
# Model group 2 - Interaction - Check
#-------------------------------------------------------------------------------
m8_lm_int <- lm(log_wdi_co2 ~ log_cc_outs + vdem_polyarchy + 
                 log_cc_outs * vdem_polyarchy +
                 log_pop + log_gdp_pc + log_gdp_pc_sq +
                 as.factor(year) + as.factor(ccode),
                data = data_s)

stargazer(m8_wb, m8_lm_int, type = "text", omit = c("ccode", "year"))

```

## Plot coefficients: Model group 2

```{r plot coefficients 2}

#-------------------------------------------------------------------------------
# Coefficient plot - Model group 2
#-------------------------------------------------------------------------------
coefs_within2 <- as.data.frame(summary(m8_wb)$coefficients[c(2,3,37,5,6,7),])
coefs_between2 <- as.data.frame(summary(m8_wb)$coefficients[c(8,4,38,9:11),])

# Create data frames for each model
coefs_within2$CI_low <- coefs_within2[,"Estimate"] - qnorm(0.95)*coefs_within2[,"Std. Error"]
coefs_within2$CI_high <- coefs_within2[,"Estimate"] + qnorm(0.95)*coefs_within2[,"Std. Error"]

coefs_between2$CI_low <- coefs_between2[,"Estimate"] - qnorm(0.95)*coefs_between2[,"Std. Error"]
coefs_between2$CI_high <- coefs_between2[,"Estimate"] + qnorm(0.95)*coefs_between2[,"Std. Error"]


pdf("Plots/Coefficients_table2_within_between.pdf", width = 7.5)

par(mar = c(5.1, 10, 2.5, 2.1)) # c(bottom, left, top, right))

plot(1,
     type = "n",
     ylim = c(0,2.5),
     bty = "n",
     xlab = "Coefficients and 95% confidence intervals",
     xlim = c(-0.95, 1.1),
     ylab = "",
     yaxt = "n",
     cex.lab = 1.1)


# add grid
segments(x0 = c(-1.25,-1,-0.75, -0.5, -0.25, 0, 0.25, 0.5, 0.75, 1, 1.25, 1.5, 1.75, 2), y0 = 0,
         x1 = c(-1.25,-1,-0.75, -0.5, -0.25, 0, 0.25, 0.5, 0.75, 1, 1.25, 1.5, 1.75, 2), y1 = 2.5,
         col = "lightgrey",
         lty = "solid", lwd = 0.7)

segments(x0 = -1.5, y0 = rev(c(0.25, 0.45, 0.65, 0.85, 1.05, 1.25, 1.45, 1.65, 1.85, 2.05, 2.25)),
         x1 = 2, y1 = rev(c(0.25, 0.45, 0.65, 0.85, 1.05, 1.25, 1.45, 1.65, 1.85, 2.05, 2.25)),
         col = "lightgrey",
         lty = "solid", lwd = 0.7)

# add null line
segments(x0 = 0, y0 = 0,
         x1 = 0, y1 = 2.5,
         lty = "solid", lwd = 1, col = "black")


# add point estimates
points(x = coefs_within2$Estimate,
       y = rev(c(0.25, 0.65, 1.05, 1.45, 1.85, 2.25)),
       pch = 16,
       col = c(viridis(3)[1], viridis(3)[1],viridis(3)[1],viridis(3)[1], viridis(3, alpha = 0.6)[1],viridis(3, alpha = 0.6)[1]),
       cex = 1)

# add CIs
segments(y0 = rev(c(0.25, 0.65, 1.05, 1.45, 1.85, 2.25)), 
         y1 = rev(c(0.25, 0.65, 1.05, 1.45, 1.85, 2.25)), 
         col = c(viridis(3)[1], viridis(3)[1], viridis(3)[1],viridis(3)[1], viridis(3, alpha = 0.3)[1],viridis(3, alpha = 0.3)[1]),
         x0 = coefs_within2$CI_low,
         x1 = coefs_within2$CI_high,
         lwd = 3, cex = 1.3, lend = 1)

# add point estimates
points(x = coefs_between2$Estimate,
       y = rev(c(0.15, 0.55, 0.95, 1.35, 1.75, 2.15)),
       pch = 16,
       col = c(viridis(3, alpha = 0.6)[2], viridis(3)[2],viridis(3)[2],viridis(3)[2],
               viridis(3, alpha = 0.6)[2], viridis(3, alpha = 0.6)[2]),
       cex = 1)

# add CIs
segments(y0 = rev(c(0.15, 0.55, 0.95, 1.35, 1.75, 2.15)), 
         y1 = rev(c(0.15, 0.55, 0.95, 1.35, 1.75, 2.15)), 
         col = c(viridis(3, alpha = 0.3)[2], viridis(3)[2],viridis(3)[2],viridis(3)[2],
                 viridis(3, alpha = 0.3)[2], viridis(3, alpha = 0.3)[2]),
         x0 = coefs_between2$CI_low,
         x1 = coefs_between2$CI_high,
         lwd = 3, cex = 1.3, lend = 1)

# add coefficients text
text(x = -1.85,
     y = rev(c(0.25, 0.65, 1.05, 1.35, 1.85, 2.25)),
     labels = c("Pollution outsourcing", "Democracy", "Pollution outsourcing \nx Democracy", 
                "Population", "GDP per capita", expression("GDP per capita"^2)),
     xpd = T,
     cex = 0.9,
     pos = 4)

# Add a legend
legend("topright", 
       legend = c("Within", "Between"), 
       col = c(viridis(3)[1], viridis(3)[2]), 
       pch = 19,
       bty = "n")

dev.off()

```



# Quantities of interest

## Simulated interaction effects

```{r plots simulated interaction effect}

#-------------------------------------------------------------------------------
# Simulate interaction effect 
#-------------------------------------------------------------------------------
nsim <- 1000
beta_hat <- as.vector(summary(m8_wb)$coefficients[c(1:38),1])
V_hat <- vcov(m8_wb)
S <- mvrnorm(nsim, beta_hat, V_hat)
quants_S <- apply(S, 2, quantile, probs = c(0.025, 0.5, 0.975))

## Plot interaction term - WITHIN
pdf("plots/sim_interact_within.pdf")

plot(density(S[,"demeaned_log_cc_outs:mean_vdem_poly"]),
     las = 1,
     lwd = 3,
     xlim = c(-0.2, 0.03),
     col = viridis(3)[1],
     bty = "n",
     cex = 1.1,
     cex.lab = 1.3,
     yaxt = "n",
     ylab = "Density",
     main = "Pollution Outsourcing x Democracy",
     xlab = "Simulated Interaction Effect (Within)")

abline(v = c(quants_S[c(1,3), "demeaned_log_cc_outs:mean_vdem_poly"]), lty = 2, col = "black")
abline(v = c(quants_S[2,"demeaned_log_cc_outs:mean_vdem_poly"]), lty = 1, lwd = 1.5, col = "black")
abline(v = 0, lty = 1, lwd = 1, col = "black")

legend("topleft",
       lty = c("solid", "dashed"),
       col = "black",
       legend = c("Mean", "95% CIs"),
       bty = "n")

dev.off()


## Plot interaction term - BETWEEN
pdf("plots/sim_interact_between.pdf")

plot(density(S[,"mean_vdem_poly:mean_log_cc_outs"]),
     las = 1,
     lwd = 3,
     xlim = c(-0.35, 0.05),
     col = viridis(3)[2],
     bty = "n",
     cex = 1.1,
     cex.lab = 1.3,
     yaxt = "n",
     ylab = "Density",
     main = "Pollution Outsourcing x Democracy",
     xlab = "Simulated Interaction Effect (Between)")

abline(v = c(quants_S[c(1,3), "mean_vdem_poly:mean_log_cc_outs"]), lty = 2, col = "black")
abline(v = c(quants_S[2,"mean_vdem_poly:mean_log_cc_outs"]), lty = 1, lwd = 1.5, col = "black")
abline(v = 0, lty = 1, lwd = 1, col = "black")

legend("topleft",
       lty = c("solid", "dashed"),
       col = "black",
       legend = c("Mean", "95% CIs"),
       bty = "n")

dev.off()

```

## Marginal effect plots

```{r marginal effects}

#-------------------------------------------------------------------------------
# Simulate marginal effects 
#-------------------------------------------------------------------------------
pdf("plots/Marginal_interaction_within.pdf")

interplot(m8_wb,
          var1 = "demeaned_log_cc_outs",
          var2 = "mean_vdem_poly",
          hist = T,
          ci = 0.95,
          sims = 1000,
          ercolor = viridis(3, alpha = 0.75)[1],
          rfill = viridis(3, alpha = 0.05)[1],
          ralpha = 0.25) +
  xlab("Democracy") +
  ylab("Marginal Effect of Outsourcing") +
  theme_bw() +
  geom_hline(yintercept = 0, linetype = "dashed", color = "black") +
  theme(axis.line = element_line(colour = "black"),
        axis.title.x = element_text(size = 15),
        axis.title.y = element_text(size = 15),
        axis.text = element_text(size = 13),
        plot.title = element_text(size = 17, hjust = 0.5),
        panel.grid.major = element_blank(),
        panel.grid.minor = element_blank(),
        panel.border = element_blank(),
        panel.background = element_blank())

dev.off()


pdf("plots/Marginal_interaction_between.pdf")

interplot(m8_wb,
          var1 = "mean_log_cc_outs",
          var2 = "mean_vdem_poly",
          hist = T,
          ci = 0.95,
          sims = 1000,
          ercolor = viridis(3, alpha = 0.75)[2],
          rfill = viridis(3, alpha = 0.05)[2],
          ralpha = 0.25) +
  xlab("Democracy") +
  ylab("Marginal Effect of Outsourcing") +
  theme_bw() +
  geom_hline(yintercept = 0, linetype = "dashed", color = "black") +
  theme(axis.line = element_line(colour = "black"),
        axis.title.x = element_text(size = 15),
        axis.title.y = element_text(size = 15),
        axis.text = element_text(size = 13),
        plot.title = element_text(size = 17, hjust = 0.5),
        panel.grid.major = element_blank(),
        panel.grid.minor = element_blank(),
        panel.border = element_blank(),
        panel.background = element_blank())

dev.off()


#-------------------------------------------------------------------------------
# Marginal effects with interflex (non-linear)
#-------------------------------------------------------------------------------
# install.packages('devtools', repos = 'http://cran.us.r-project.org') # if not already installed
# devtools::install_github('xuyiqing/interflex')

#-------------------------------------------------------------------------------
# Preparation

ranef_m8 <- as.data.frame(ranef(m8_wb))
extract_ccode <- ranef_m8$grp
extract_year <- c(1990:2015)

## Reduce to correct length
data_ova <- data_s[which(!is.na(data_s$demeaned_cc_outs) &
                         !is.na(data_s$demeaned_vdem_poly) &
                         !is.na(data_s$demeaned_log_pop) &
                         !is.na(data_s$demeaned_log_gdp_pc) &
                         !is.na(data_s$log_wdi_co2)),]

## Create dummy variables for countries and years
data_ova <- dummy_cols(data_ova[which(data_ova$ccode %in% extract_ccode &  
                                      data_ova$year %in% extract_year),],
                       select_columns = c("ccode", "year"))

data_ova$interaction_within <- data_ova$demeaned_log_cc_outs*data_ova$mean_vdem_poly
data_ova$interaction_between <- data_ova$mean_log_cc_outs*data_ova$mean_vdem_poly

data_ova <- as.data.frame(data_ova)

#-------------------------------------------------------------------------------
# WITHIN INTERACTION

## Raw plot (within interaction)
out1_raw <- interflex(Y = "log_wdi_co2", 
                      D = "demeaned_log_cc_outs", 
                      X = "mean_vdem_poly", 
                      treat.type = "continuous",
                      Z = c("demeaned_vdem_poly", "demeaned_log_pop", "demeaned_log_gdp_pc",
                            "demeaned_log_gdp_pc_sq", "mean_log_pop", "mean_log_gdp_pc",
                            "mean_log_gdp_pc_sq", "interaction_between"),
                      data = data_ova, 
                      FE = c("year", "ccode"),
                      na.rm = T,
                      estimator = "raw", 
                      vcov.type = "robust", 
                      main = "Marginal Effects", 
                      ylim = c(-15, 15))

## Plot raw
plot(out1_raw)

## Marginal effects plot (within interaction)
out1 <- interflex(Y = "log_wdi_co2", 
                  D = "demeaned_log_cc_outs", 
                  X = "mean_vdem_poly", 
                  treat.type = "continuous",
                  Z = c("demeaned_vdem_poly", "demeaned_log_pop", "demeaned_log_gdp_pc",
                       "demeaned_log_gdp_pc_sq", "mean_log_pop", "mean_log_gdp_pc",
                       "mean_log_gdp_pc_sq", "interaction_between"),
                  data = data_ova, 
                  FE = c("year", "ccode"),
                  na.rm = T,
                  estimator = "binning", 
                  vcov.type = "robust", 
                  main = "Marginal Effects", 
                  ylim = c(-15, 15))


pdf("plots/Marginal_interflex_within.pdf", width = 12)

plot(out1,
     pool = T,
     xlab = "Democracy",
     ylab = "Marginal Effect of Pollution Outsourcing",
     Dlabel = "",
     theme.bw = TRUE, show.grid = FALSE,
     cex.axis = 0.8, cex.lab = 0.8, color = viridis(3)[1], legend.title = "")

dev.off()


#-------------------------------------------------------------------------------
# BETWEEN INTERACTION (where is the uncertainty??)

data_s$year2 <- as.factor(data_s$year)

m8_wb_exp <- lmer(log_wdi_co2 ~ demeaned_log_cc_outs + demeaned_vdem_poly + 
                    demeaned_log_cc_outs * mean_vdem_poly +
                    demeaned_log_pop + demeaned_log_gdp_pc + demeaned_log_gdp_pc_sq +
                    mean_log_cc_outs + mean_vdem_poly +
                    mean_log_cc_outs * mean_vdem_poly +
                    mean_log_pop +  mean_log_gdp_pc + mean_log_gdp_pc_sq +
                    year2 + (1 | ccode),
                   data = data_s)

library(margins)

# margins package to compute marginal effects (zero for reference)


library(interactions)
slopes <- sim_slopes(m8_wb_exp, pred = mean_log_cc_outs, modx = mean_vdem_poly, 
                     modx.values = c(seq(min(data_s$vdem_polyarchy, na.rm = T), 
                                         max(data_s$vdem_polyarchy, na.rm = T), 
                                         length.out = 10)))
print(slopes)

plot(slopes$slopes$`Value of mean_vdem_poly`,
     slopes$slopes$Est.,
     ylim = c(-0.25, 0.15),
     pch = 19,
     bty = "n",
     ylab = "Marginal Effect Outsourcing",
     xlab = "Democracy")
abline(h = 0)
segments(x0 = slopes$slopes$`Value of mean_vdem_poly`,
         x1 = slopes$slopes$`Value of mean_vdem_poly`,
         y0 = slopes$slopes$`2.5%`,
         y1 = slopes$slopes$`97.5%`,
         col = adjustcolor("black", alpha = 0.3),
         lwd = 3, lend = 1)



## Raw plot (between interaction)
out2_raw <- interflex(Y = "log_wdi_co2", 
                      D = "mean_log_cc_outs", 
                      X = "mean_vdem_poly", 
                      treat.type = "continuous",
                      Z = c("mean_log_pop", "mean_log_gdp_pc",
                            "mean_log_gdp_pc_sq", "demeaned_vdem_poly", 
                            "demeaned_log_pop", "demeaned_log_gdp_pc",
                            "demeaned_log_gdp_pc_sq", "interaction_within"),
                      data = data_ova, 
                      FE = c("year"),
                      na.rm = T,
                      estimator = "raw", 
                      vcov.type = "cluster", 
                      cl = "ccode",
                      main = "Marginal Effects", 
                      ylim = c(-15, 15))

## Plot raw
plot(out2_raw)

## Marginal effects plot (between interaction)
out2 <- interflex(Y = "log_wdi_co2", 
                  D = "mean_log_cc_outs", 
                  X = "mean_vdem_poly", 
                  treat.type = "continuous",
                  Z = c("mean_log_pop", "mean_log_gdp_pc",
                        "mean_log_gdp_pc_sq", "demeaned_log_cc_outs", 
                        "demeaned_vdem_poly", "demeaned_log_pop", "demeaned_log_gdp_pc",
                        "demeaned_log_gdp_pc_sq", "interaction_within"),
                  data = data_ova, 
                  FE = c("year"),
                  na.rm = T,
                  estimator = "binning", 
                  vcov.type = "cluster", 
                  cl = "ccode",
                  main = "Marginal Effects", 
                  ylim = c(-15, 15))

pdf("plots/Marginal_interflex_between.pdf", width = 12)

plot(out2,
     pool = T,
     xlab = "Democracy",
     ylab = "Marginal Effect of Pollution Outsourcing",
     theme.bw = TRUE, show.grid = FALSE,
     cex.axis = 0.8, cex.lab = 0.8, color = viridis(3)[2], legend.title = "")

dev.off()

```

## Simulate expected values: Model group 1 (Within)

```{r expected values 1 within}

#-------------------------------------------------------------------------------
# Preparation
#-------------------------------------------------------------------------------
ranef_m3 <- as.data.frame(ranef(m3_wb))
extract_ccode <- ranef_m3$grp
extract_year <- c(1990:2015)

## Create dummy variables for countries and years
data_ova <- dummy_cols(data_ova[which(data_ova$ccode %in% extract_ccode &  
                                      data_ova$year %in% extract_year),],
                       select_columns = c("ccode", "year"))

summary(data_ova$log_cc_outs)
#-------------------------------------------------------------------------------
# Get X for observed value approach
#-------------------------------------------------------------------------------

# make sure I will select the right columns for the country dummies
first_country <- which(colnames(data_ova) == "ccode_8") # or 4?
last_country <- which(colnames(data_ova) == "ccode_894")

# make sure I will select the right columns for the year dummies
first_year <- which(colnames(data_ova) == "year_1991") # or 1990?
last_year <- which(colnames(data_ova) == "year_2015")

# Get X
length(fixef(m3_wb))
X1 <- as.matrix(cbind(1,
                      data_ova$demeaned_vdem_poly,
                      data_ova$demeaned_log_gdp_pc, 
                      data_ova$demeaned_log_gdp_pc_sq,
                      data_ova$demeaned_log_pop,
                      data_ova$mean_vdem_poly,
                      data_ova$mean_log_gdp_pc,
                      data_ova$mean_log_gdp_pc_sq,
                      data_ova$mean_log_pop,
                      data_ova[,c(first_year:last_year)]))

X <- na.omit(X1)
dim(X1)

## Get colnames
colnames(X1) <- names(fixef(m3_wb))


#-------------------------------------------------------------------------------
# Install simulation github Rittmann et al. (2023)
#-------------------------------------------------------------------------------
install_github("mneunhoe/simloglm")
library (simloglm)

## Function to sample from inverse gamma distribution
rinvgamma <- function(n, shape, rate = 1, scale = 1/rate){
  if(missing(rate) && !missing(scale))
    rate <- 1/scale
  1/stats::rgamma(n, shape, rate)
}


# Set up informal posterior of coefficients
nsim <- 1000 # number of draws

beta_hat1 <- c()

for(i in 1:length(fixef(m3_wb))){
  beta_hat1[i] <- c(fixef(m3_wb)[[i]])
}

## Get colnames
names(beta_hat1) <- names(fixef(m3_wb))

sigma_hat1 <- summary(m3_wb)$sigma
X_prime_X1 <- as.matrix(vcov(m3_wb)) / summary(m3_wb)$sigma^2 ## equivalent to vcov$unscaled

# First sigma^2
set.seed(199610)
sigma2_tilde1 <- rinvgamma(nsim,
                           shape = df.residual(m3_wb)/2,
                           rate = (sigma_hat1^2*df.residual(m3_wb))/2)

# Now the betas
beta_tilde1 <- matrix(NA,
                      nrow = nsim,
                      ncol = length(beta_hat1))

for(sim in 1:nsim){
  beta_tilde1[sim, ] <-
    MASS::mvrnorm(1, beta_hat1, X_prime_X1 * sigma2_tilde1[sim])
}



#-------------------------------------------------------------------------------
# Set scenarios: over range of democracy 
#-------------------------------------------------------------------------------

#-------------------------------------------------------------------------------
# Determine "reasonable" variation of democracy variable (within)
range_dem <- aggregate(data_ova$demeaned_vdem_poly,
                       by = list(data_ova$cname),
                       FUN = range)

names(range_dem) <- c("country", "range")

# get the difference
range_dem$diff <- range_dem$range[,2]- range_dem$range[,1]

# get the respective country
max_range_dem_country <- range_dem$country[which.max(range_dem$diff)]

# and the respective values (min and max > used for setting the scenario) 
min_dem <- range_dem$range[,1][which.max(range_dem$diff)]
max_dem <- range_dem$range[,2][which.max(range_dem$diff)]

# center around 0 (mean of demeaned_poly distribution) by adding midpoint of range to min and max
min_dem_w <- min_dem_w - ((min_dem + max_dem)/2)
max_dem_w <- max_dem_w - ((min_dem + max_dem)/2)

# Within range
range_dem_w <- as.vector(seq(min_dem_w, max_dem_w, length.out = 30))


#-------------------------------------------------------------------------------
# WITHIN EFFECT
#-------------------------------------------------------------------------------

# Create empty matrix to store the scenarios
cases1w <- array(NA, 
                 c(dim(X1), length(range_dem_w)))

# Copy in our X
cases1w[,,] <- X1 # because OVA not Average case!

# Select: Democracy over range
select <- which(colnames(X1) == "demeaned_vdem_poly")

for(i in 1:length(range_dem_w)){
  cases1w[ ,select, i] <- range_dem_w[i]
}

head(cases1w)
dim(cases1w) 

# check dimensions 

# Calculate linear predictor on log scale & retransform
E_Y_c_dem_1w <- matrix(NA, nrow = nsim, ncol = length(range_dem_w)) # empty matrix for storage

for(i in 1:length(range_dem_w)){             # 30 rounds (for range of dem variable)
   ev <- beta_tilde1 %*% t(cases1w[,,i])     # calculate linear predictor on log scale
   ev_plus_gamma <- ev + 1/2*sigma2_tilde1   # add draws of 1/2*sigma2_tilde
   ev_transf <- exp(ev_plus_gamma)           # transform
   ev_transf_m <- ev_transf #- 1             # E(y) 
   tmp_val <- apply(ev_transf_m, 1, mean)    # now average results
   E_Y_c_dem_1w[,i] <- tmp_val               # store in val object
}

# Summarize to get CIs
CI_E_Y_c_dem_1w <- apply(E_Y_c_dem_1w, 2, quantile, c(0.025, 0.975))

# Now get the point estimates 
E_Y_c_hat_dem_1w <- matrix(NA, nrow = length(range_dem_w), ncol = 1)

for(i in 1:length(range_dem_w)){                       # 30 rounds (for range of dem variable)
  ev_point <- beta_hat1 %*% t(cases1w[,,i])            # use beta_hat (not simulated)
  ev_point_plus_gamma <- ev_point + 1/2*sigma_hat1^2   # use sigma_hat (not simulated)
  ev_point_transf <- exp(ev_point_plus_gamma)          # transform
  ev_point_transf_m <- ev_point_transf #- 1            # E(y) 
  tmp_val_point <- apply(ev_point_transf_m, 1, mean)   # now average results
  E_Y_c_hat_dem_1w[i,] <- tmp_val_point                # store in val object
}


#-------------------------------------------------------------------------------
# Plot: Expected values of pollution outsourcing (on original scale)
#-------------------------------------------------------------------------------

pdf("plots/Expected value_m1_within.pdf") 

par(mar = c(5.1, 4.7, 4.1, 2.1)) # c(bottom, left, top, right))

plot(range_dem_w,
     E_Y_c_hat_dem_1w,
     type = "n",
     ylim = c(30, 70),
     xlim = c(min_dem_w, max_dem_w),
     ylab = expression("Pollution Outsourcing [Gg]"),
     xlab = "Democracy",
     bty = "n",
     main = NULL,
     cex.lab = 1.3)

segments(x0 = range_dem_w, x1 = range_dem_w,
         y1 = CI_E_Y_c_dem_1w[2,], y0 = CI_E_Y_c_dem_1w[1,],
         col = viridis(3, 0.25)[1],
         lwd = 4, lend = 1)

points(range_dem_w, E_Y_c_hat_dem_1w, col = viridis(3, 0.8)[1], pch = 20,
       cex = 1.5)


# Add a "histogram" of actual X1-values.
axis(1,
     at = data_ova$demeaned_vdem_poly,
     col.ticks = "gray30",
     labels = FALSE,
     tck = 0.02)

dev.off()


#-------------------------------------------------------------------------------
# First difference (WITHIN)
#-------------------------------------------------------------------------------

## First difference: Democracy high - low
FD_dem_hat_1w <- E_Y_c_hat_dem_1w[30,] - E_Y_c_hat_dem_1w[1,]  ## max - min
FD_dem_1w <- E_Y_c_dem_1w[,30] - E_Y_c_dem_1w[,1] 
CI_FD_dem_1w <- quantile(FD_dem_1w, c(0.025, 0.975))


```

## Simulate expected values: Model group 1 (Between)

```{r expected values 1 between}

#-------------------------------------------------------------------------------
# BETWEEN EFFECT
#-------------------------------------------------------------------------------

# Range democracy (between)
range_dem_b <- as.vector(seq(min(data_ova$mean_vdem_poly, na.rm = T), max(data_ova$mean_vdem_poly, na.rm = T), length.out = 30))

# Create empty matrix to store the scenarios
cases1b <- array(NA, 
                 c(dim(X1), length(range_dem_b)))

# Copy in our X
cases1b[,,] <- X1 # because OVA not Average case!

# Select: Democracy over range
select <- which(colnames(X1) == "demeaned_vdem_poly")

for(i in 1:length(range_dem_b)){
  cases1b[ ,select, i] <- range_dem_b[i]
}

head(cases1b)
dim(cases1b) 

# check dimensions 

# Calculate linear predictor on log scale & retransform
E_Y_c_dem_1b <- matrix(NA, nrow = nsim, ncol = length(range_dem_b)) # empty matrix for storage

for(i in 1:length(range_dem_b)){             # 30 rounds (for range of dem variable)
   ev <- beta_tilde1 %*% t(cases1b[,,i])     # calculate linear predictor on log scale
   ev_plus_gamma <- ev + 1/2*sigma2_tilde1   # add draws of 1/2*sigma2_tilde
   ev_transf <- exp(ev_plus_gamma)           # transform
   ev_transf_m <- ev_transf #- 1             # E(y) 
   tmp_val <- apply(ev_transf_m, 1, mean)    # now average results
   E_Y_c_dem_1b[,i] <- tmp_val               # store in val object
}

# Summarize to get CIs
CI_E_Y_c_dem_1b <- apply(E_Y_c_dem_1b, 2, quantile, c(0.025, 0.975))

# Now get the point estimates 
E_Y_c_hat_dem_1b <- matrix(NA, nrow = length(range_dem_b), ncol = 1)

for(i in 1:length(range_dem_b)){                       # 30 rounds (for range of dem variable)
  ev_point <- beta_hat1 %*% t(cases1b[,,i])            # use beta_hat (not simulated)
  ev_point_plus_gamma <- ev_point + 1/2*sigma_hat1^2   # use sigma_hat (not simulated)
  ev_point_transf <- exp(ev_point_plus_gamma)          # transform
  ev_point_transf_m <- ev_point_transf #- 1            # E(y) 
  tmp_val_point <- apply(ev_point_transf_m, 1, mean)   # now average results
  E_Y_c_hat_dem_1b[i,] <- tmp_val_point                # store in val object
}


#-------------------------------------------------------------------------------
# Plot: Expected values of pollution outsourcing (on original scale)
#-------------------------------------------------------------------------------

pdf("plots/Expected value_m1_between.pdf") 

par(mar = c(5.1, 4.7, 4.1, 2.1)) # c(bottom, left, top, right))

plot(range_dem_b,
     E_Y_c_hat_dem_1b,
     type = "n",
     ylim = c(30, 80),
     xlim = c(0,1),
     ylab = expression("Pollution Outsourcing [Gg]"),
     xlab = "Democracy",
     bty = "n",
     main = NULL,
     cex.lab = 1.3)

segments(x0 = range_dem_b, x1 = range_dem_b,
         y1 = CI_E_Y_c_dem_1b[2,], y0 = CI_E_Y_c_dem_1b[1,],
         col = viridis(3, 0.25)[2],
         lwd = 4, lend = 1)

points(range_dem_b, E_Y_c_hat_dem_1b, col = viridis(3, 0.8)[2], pch = 20,
       cex = 1.5)


# Add a "histogram" of actual X1-values.
axis(1,
     at = data_ova$mean_vdem_poly,
     col.ticks = "gray30",
     labels = FALSE,
     tck = 0.02)

dev.off()


#-------------------------------------------------------------------------------
# First difference (BETWEEN)
#-------------------------------------------------------------------------------

## First difference: Democracy high - low
FD_dem_hat_1b <- E_Y_c_hat_dem_1b[30,] - E_Y_c_hat_dem_1b[1,]  ## max - min
FD_dem_1b <- E_Y_c_dem_1b[,30] - E_Y_c_dem_1b[,1] 
CI_FD_dem_1b <- quantile(FD_dem_1b, c(0.025, 0.975))



#-------------------------------------------------------------------------------
## How "big" is the difference compared to variation in the dataset?
#-------------------------------------------------------------------------------
sd_outs <- aggregate(data_ova$cf_cc_pop_mean_outs,
                     by = list(data_ova$cname),
                     FUN = sd)

mean_within_country_sd <- mean(sd_outs[,2], na.rm = T)
mean_between_country_sd <- sd(data_ova$mean_cc_outs, na.rm = T)
share_FD_var_w <- FD_dem_hat_1w / mean_within_country_sd
share_FD_var_b <- FD_dem_hat_1b / mean_between_country_sd

```

## Simulate expected values and first differences: Model group 2 (Within)

```{r expected values 2 within}

#-------------------------------------------------------------------------------
# Get X for observed value approach
#-------------------------------------------------------------------------------

# make sure I will select the right columns for the country dummies
first_country <- which(colnames(data_ova) == "ccode_8") # or 4?
last_country <- which(colnames(data_ova) == "ccode_894")

# make sure I will select the right columns for the year dummies
first_year <- which(colnames(data_ova) == "year_1991") # or 1990?
last_year <- which(colnames(data_ova) == "year_2015")

# Get X
fixef(m8_wb)
X <- as.matrix(cbind(1,
                       data_ova$demeaned_log_cc_outs,
                       data_ova$demeaned_vdem_poly,
                       data_ova$mean_vdem_poly,
                       data_ova$demeaned_log_pop,
                       data_ova$demeaned_log_gdp_pc, 
                       data_ova$demeaned_log_gdp_pc_sq,
                       data_ova$mean_log_cc_outs,
                       data_ova$mean_log_pop,
                       data_ova$mean_log_gdp_pc,
                       data_ova$mean_log_gdp_pc_sq,
                       data_ova[,c(first_year:last_year)],
                       data_ova$demeaned_log_cc_outs * data_ova$mean_vdem_poly,
                       data_ova$mean_vdem_poly * data_ova$mean_log_cc_outs))

X <- na.omit(X)
dim(X)

## Get colnames
colnames(X) <- names(fixef(m8_wb))

#-------------------------------------------------------------------------------
# Start with set-up for retransformation
#-------------------------------------------------------------------------------
fixef(m8_wb)
ranef(m8_wb)

# Set up informal posterior of coefficients
nsim <- 1000 # number of draws

beta_hat <- c()

for(i in 1:length(fixef(m8_wb))){
  beta_hat[i] <- c(fixef(m8_wb)[[i]])
}

## Get colnames
names(beta_hat) <- names(fixef(m8_wb))

sigma_hat <- summary(m8_wb)$sigma
X_prime_X <- as.matrix(vcov(m8_wb)) / summary(m8_wb)$sigma^2 ## equivalent to vcov$unscaled

# First sigma^2
set.seed(199610)
sigma2_tilde <- rinvgamma(nsim,
                          shape = df.residual(m8_wb)/2,
                          rate = (sigma_hat^2*df.residual(m8_wb))/2)

# Now the betas
beta_tilde <- matrix(NA,
                     nrow = nsim,
                     ncol = length(beta_hat))

for(sim in 1:nsim){
  beta_tilde[sim, ] <-
    MASS::mvrnorm(1, beta_hat, X_prime_X * sigma2_tilde[sim])
}


#-------------------------------------------------------------------------------
# Set scenarios
#-------------------------------------------------------------------------------

#-------------------------------------------------------------------------------
# Determine "reasonable" variation of outsourcing
range_outsource_w <- aggregate(data_ova$demeaned_log_cc_outs,
                               by = list(data_ova$cname),
                               FUN = range)

names(range_outsource_w) <- c("country", "range")

# get the difference
range_outsource_w$diff <- range_outsource_w$range[,2]-range_outsource_w$range[,1]

# get the respective country
max_range_outsource_country_w <- range_outsource_w$country[which.max(range_outsource_w$diff)]

# and the respective values (min and max > used for setting the scenario)
min_outsource_w <- range_outsource_w$range[,1][which.max(range_outsource_w$diff)]
max_outsource_w <- range_outsource_w$range[,2][which.max(range_outsource_w$diff)]


#-------------------------------------------------------------------------------
# Determine "reasonable" variation of dem (values from binning)
min_dem <- out1$est.bin$`demeaned_log_cc_outs=-0.029 (50%)`[1,1]  
max_dem <- out1$est.bin$`demeaned_log_cc_outs=-0.029 (50%)`[3,1]

min_dem_country <- data_ova$cname[which(data_ova$mean_vdem_poly %like% min_dem)]
max_dem_country <- data_ova$cname[which(data_ova$mean_vdem_poly %like% max_dem)]


#-------------------------------------------------------------------------------
## Create empty matrix to store the scenarios
cases_w <- array(NA, 
                c(dim(X), 4))

## Copy in our X
cases_w[,,] <- X

## Add names
colnames(cases_w) <- names(beta_hat)

head(cases_w)

## Case 1: min dem & min outsource
cases_w[ ,"mean_vdem_poly", 1] <- min_dem
cases_w[ ,"demeaned_log_cc_outs", 1] <- min_outsource_w
cases_w[ ,"demeaned_log_cc_outs:mean_vdem_poly", 1] <- min_dem*min_outsource_w
cases_w[ ,"mean_vdem_poly:mean_log_cc_outs", 1] <- min_dem*mean(data_ova$mean_log_cc_outs, na.rm =T)

## Case 2: min dem & max outsource
cases_w[ ,"mean_vdem_poly", 2] <- min_dem
cases_w[ ,"demeaned_log_cc_outs", 2] <- max_outsource_w
cases_w[ ,"demeaned_log_cc_outs:mean_vdem_poly", 2] <- min_dem*max_outsource_w
cases_w[ ,"mean_vdem_poly:mean_log_cc_outs", 2] <- min_dem*mean(data_ova$mean_log_cc_outs, na.rm =T)

## Case 3: max dem & min outsource
cases_w[ ,"mean_vdem_poly", 3] <- max_dem
cases_w[ ,"demeaned_log_cc_outs", 3] <- min_outsource_w
cases_w[ ,"demeaned_log_cc_outs:mean_vdem_poly", 3] <- max_dem*min_outsource_w
cases_w[ ,"mean_vdem_poly:mean_log_cc_outs", 3] <- max_dem*mean(data_ova$mean_log_cc_outs, na.rm =T)

## Case 4: max dem & max outsource
cases_w[ ,"mean_vdem_poly", 4] <- max_dem
cases_w[ ,"demeaned_log_cc_outs", 4] <- max_outsource_w
cases_w[ ,"demeaned_log_cc_outs:mean_vdem_poly", 4] <- max_dem*max_outsource_w
cases_w[ ,"mean_vdem_poly:mean_log_cc_outs", 4] <- max_dem*mean(data_ova$mean_log_cc_outs, na.rm =T)


#-------------------------------------------------------------------------------
## Calculate linear predictor on log scale & retransform
#-------------------------------------------------------------------------------
E_Y_c_w <- matrix(NA, nrow = nsim, ncol = 4)  # empty matrix for storage

for(i in 1:4){                              # four rounds (for scenarios)
  ev <- beta_tilde %*% t(cases_w[,,i])      # calculate linear predictor on log scale
  ev_plus_gamma <- ev + 1/2*sigma2_tilde    # add draws of 1/2*sigma2_tilde
  ev_transf <- exp(ev_plus_gamma)           # transform
  ev_transf_m <- ev_transf - 1              # E(y) - 1
  tmp_val <- apply(ev_transf_m, 1, mean)    # now average results
  E_Y_c_w[,i] <- tmp_val                    # store in val object
}

## Summarize to get CIs
CI_E_Y_c_w <- apply(E_Y_c_w, 2, quantile, c(0.025, 0.5, 0.975))

## Now get the point estimates 
E_Y_c_hat_w <- matrix(NA, nrow = 4, ncol = 1)

for(i in 1:4){                                       # four rounds (for scenarios)
  ev_point <- beta_hat %*% t(cases_w[,,i])           # use beta_hat (not simulated)
  ev_point_plus_gamma <- ev_point + 1/2*sigma_hat^2  # use sigma_hat (not simulated)
  ev_point_transf <- exp(ev_point_plus_gamma)        # transform
  ev_point_transf_m <- ev_point_transf - 1           # E(y) - 1
  tmp_val_point <- apply(ev_point_transf_m, 1, mean) # now average results
  E_Y_c_hat_w[i,] <- tmp_val_point                   # store
}


#-------------------------------------------------------------------------------
# First differences
#-------------------------------------------------------------------------------

## First difference: Democracy low -> Outsource max - min
FD_dem_low_hat_w <- E_Y_c_hat_w[2,] - E_Y_c_hat_w[1,]  ## max - min
FD_dem_low_w <- E_Y_c_w[,2] - E_Y_c_w[,1] 
CI_FD_dem_low_w <- quantile(FD_dem_low_w, c(0.025, 0.975))

## First difference: Democracy high -> Outsource max - min
FD_dem_high_hat_w <- E_Y_c_hat_w[4,] - E_Y_c_hat_w[3,]  ## max - min
FD_dem_high_w <- E_Y_c_w[,4] - E_Y_c_w[,3] 
CI_FD_dem_high_w <- quantile(FD_dem_high_w, c(0.025, 0.975))

## First difference of first difference
FD_FD_hat_w <- FD_dem_high_hat_w - FD_dem_low_hat_w
FD_FD_w <- FD_dem_high_w - FD_dem_low_w
CI_FD_FD_w <- quantile(FD_FD_w, c(0.025, 0.975))



#-------------------------------------------------------------------------------
## Plot First differences
#-------------------------------------------------------------------------------
pdf("plots/FD_interaction_within.pdf")

plot(y = 1,
     x = FD_dem_low_hat_w,
     col = viridis(3)[1],
     ylim = c(0,3),
     xlim = c(-2,5.5),
     xlab = expression("CO"[2]*" emissions [metric tons/capita]"),
     pch = 19,
     main = NULL, 
     bty = "n",
     ylab = "",
     yaxt = "n",
     cex = 3,
     cex.lab = 1.1)

segments(y0 = 1, x0 = CI_FD_dem_low_w[1],
         y1 = 1, x1 = CI_FD_dem_low_w[2],
         col = viridis(3)[1],
         lwd = 12, cex = 1.5, lend = 1)

segments(y0 = 2, x0 = CI_FD_dem_high_w[1],
         y1 = 2, x1 = CI_FD_dem_high_w[2],
         col = viridis(3, alpha = 0.3)[1],
         lwd = 12, cex = 1.5, lend = 1)

points(y = 2,
       x = FD_dem_high_hat_w,
       col = viridis(3)[1],
       xlab = "",
       pch = 19,
       main = NULL, 
       bty = "n",
       ylab = "",
       yaxt = "n",
       cex = 3,
       cex.lab = 1.1)

segments(x0 = 0, y0 = 0,
         x1 = 0, y1 = 3,
         lty = "dashed", lwd = 2)

text(x = FD_dem_low_hat_w,
     y = 0.8,
     labels = c("Less democratic"),
     cex = 1.1)

text(x = FD_dem_low_hat_w,
     y = 0.57,
     labels = c("Outsourcing (max - min)"),
     cex = 0.9)

text(x = FD_dem_high_hat_w-0.35,
     y = 1.8,
     labels = c("More democratic"),
     cex = 1.1)

text(x = FD_dem_high_hat_w-0.35,
     y = 1.57,
     labels = c("Outsourcing (max - min)"),
     cex = 0.9)

dev.off()

## First difference of first difference
pdf("plots/FD_FD_within.pdf")

plot(y = 1.5,
     x = FD_FD_hat_w,
     col = viridis(3)[1],
     ylim = c(0,3),
     xlim = c(-4.5,0.75),
     xlab = expression("CO"[2]*" emissions [metric tons/capita]"),
     pch = 19,
     main = NULL, 
     bty = "n",
     ylab = "",
     yaxt = "n",
     cex = 3,
     cex.lab = 1.1)

segments(y0 = 1.5, x0 = CI_FD_FD_w[1],
         y1 = 1.5, x1 = CI_FD_FD_w[2],
         col = viridis(3)[1],
         lwd = 12, cex = 1.5, lend = 1)

segments(x0 = 0, y0 = 0,
         x1 = 0, y1 = 3,
         lty = "dashed", lwd = 2)

text(x = FD_FD_hat_w,
     y = 1.26,
     labels = c("FD of FD"),
     cex = 1.1)

text(x = FD_FD_hat_w,
     y = 1.05,
     labels = c("Democracy (more - less)"),
     cex = 0.9)

dev.off()


```

## Simulate expected values and first differences: Model group 2 (Between)

```{r expected values 2 between}

#-------------------------------------------------------------------------------
# Set scenarios
#-------------------------------------------------------------------------------

#-------------------------------------------------------------------------------
# Determine "reasonable" variation of outsourcing
min_outsource_b <- min(data_ova$mean_log_cc_outs, na.rm = T)
max_outsource_b <- max(data_ova$mean_log_cc_outs, na.rm = T)
mean_outsource_b <- mean(data_ova$mean_log_cc_outs, na.rm = T)

min_outsource_country <- data_ova$cname[which.min(data_ova$mean_log_cc_outs)]
max_outsource_country <- data_ova$cname[which.max(data_ova$mean_log_cc_outs)]

#-------------------------------------------------------------------------------
# Determine "reasonable" variation of dem
min_dem <- min(data_ova$mean_vdem_poly, na.rm = T)
max_dem <- max(data_ova$mean_vdem_poly, na.rm = T) 

min_dem_country <- data_ova$cname[which.min(data_ova$mean_vdem_poly)]
max_dem_country <- data_ova$cname[which.max(data_ova$mean_vdem_poly)]

fixef(m8_wb)
## Set scenarios
scen_1_min_min_b <- cbind(1,
                          mean(data_ova$demeaned_log_cc_outs, na.rm = T),
                          mean(data_ova$demeaned_vdem_poly, na.rm = T),
                          min_dem,
                          mean(data_ova$demeaned_log_pop, na.rm = T),
                          mean(data_ova$demeaned_log_gdp_pc, na.rm = T),
                          mean(data_ova$demeaned_log_gdp_pc_sq, na.rm = T),
                          min_outsource_b,
                          mean(data_ova$mean_log_pop, na.rm = T),
                          mean(data_ova$mean_log_gdp_pc, na.rm = T),
                          mean(data_ova$mean_log_gdp_pc_sq, na.rm = T),
                          median(data_ova$year_1991, na.rm = T),
                          median(data_ova$year_1992, na.rm = T),
                          median(data_ova$year_1993, na.rm = T),
                          median(data_ova$year_1994, na.rm = T),
                          median(data_ova$year_1995, na.rm = T),
                          median(data_ova$year_1996, na.rm = T),
                          median(data_ova$year_1997, na.rm = T),
                          median(data_ova$year_1998, na.rm = T),
                          median(data_ova$year_1999, na.rm = T),
                          median(data_ova$year_2000, na.rm = T),
                          median(data_ova$year_2001, na.rm = T),
                          median(data_ova$year_2002, na.rm = T),
                          median(data_ova$year_2003, na.rm = T),
                          median(data_ova$year_2004, na.rm = T),
                          median(data_ova$year_2005, na.rm = T),
                          median(data_ova$year_2006, na.rm = T),
                          median(data_ova$year_2007, na.rm = T),
                          median(data_ova$year_2008, na.rm = T),
                          median(data_ova$year_2009, na.rm = T),
                          median(data_ova$year_2010, na.rm = T),
                          median(data_ova$year_2011, na.rm = T),
                          median(data_ova$year_2012, na.rm = T),
                          median(data_ova$year_2013, na.rm = T),
                          median(data_ova$year_2014, na.rm = T),
                          median(data_ova$year_2015, na.rm = T),
                          mean(data_ova$demeaned_cc_outs, na.rm = T) * min_dem,
                          min_dem * min_outsource_b)

# colnames
colnames(scen_1_min_min_b) <- names(fixef(m8_wb))

# copy existing scenario1 into new object scenario2 
scen_1_min_max_b <- scen_1_min_min_b
scen_1_max_min_b <- scen_1_min_min_b
scen_1_max_max_b <- scen_1_min_min_b
scen_1_mean_min_b <- scen_1_min_min_b
scen_1_mean_max_b <- scen_1_min_min_b

# switch only the changing values to get scenario min dem, max outsourcing 
scen_1_min_max_b[, which(colnames(scen_1_min_max_b) == "mean_log_cc_outs")] <- max_outsource_b
scen_1_min_max_b[, which(colnames(scen_1_min_max_b) == "mean_vdem_poly:mean_log_cc_outs")] <- min_dem*max_outsource_b

# switch only the changing values to get scenario max dem, min outsourcing 
scen_1_max_min_b[, which(colnames(scen_1_max_min_b) == "mean_vdem_poly")] <- max_dem
scen_1_max_min_b[, which(colnames(scen_1_max_min_b) == "demeaned_log_cc_outs:mean_vdem_poly")] <- mean(data_ova$demeaned_cc_outs, na.rm = T) * max_dem
scen_1_max_min_b[, which(colnames(scen_1_max_min_b) == "mean_vdem_poly:mean_log_cc_outs")] <- max_dem*min_outsource_b

# switch only the changing values to get scenario max dem, max outsourcing 
scen_1_max_max_b[, which(colnames(scen_1_max_max_b) == "mean_vdem_poly")] <- max_dem
scen_1_max_max_b[, which(colnames(scen_1_max_max_b) == "mean_log_cc_outs")] <- max_outsource_b
scen_1_max_max_b[, which(colnames(scen_1_max_max_b) == "demeaned_log_cc_outs:mean_vdem_poly")] <- mean(data_ova$demeaned_cc_outs, na.rm = T) * max_dem
scen_1_max_max_b[, which(colnames(scen_1_max_max_b) == "mean_vdem_poly:mean_log_cc_outs")] <- max_dem*max_outsource_b

# For mean outsourcing
scen_1_mean_min_b[, which(colnames(scen_1_mean_min_b) == "mean_log_cc_outs")] <- mean_outsource_b
scen_1_mean_min_b[, which(colnames(scen_1_mean_min_b) == "mean_vdem_poly:mean_log_cc_outs")] <- min_dem*mean_outsource_b
scen_1_mean_max_b[, which(colnames(scen_1_mean_max_b) == "mean_log_cc_outs")] <- mean_outsource_b
scen_1_mean_max_b[, which(colnames(scen_1_mean_max_b) == "mean_vdem_poly")] <- max_dem
scen_1_mean_max_b[, which(colnames(scen_1_mean_max_b) == "demeaned_log_cc_outs:mean_vdem_poly")] <- mean(data_ova$demeaned_cc_outs, na.rm = T) * max_dem
scen_1_mean_max_b[, which(colnames(scen_1_mean_max_b) == "mean_vdem_poly:mean_log_cc_outs")] <- max_dem*mean_outsource_b



# min democracy (vary outsourcing from max to min)
X_b_min_dem <- rbind(scen_1_min_min_b, scen_1_min_max_b)

# max democracy (vary outsourcing from max to min)
X_b_max_dem <- rbind(scen_1_max_min_b, scen_1_max_max_b)

# mean outsourcing (vary democracy from max to min)
X_b_mean <- rbind(scen_1_mean_min_b, scen_1_mean_max_b)


# Calculate the linear predictor on log scale
X_beta_b_min_dem <- beta_tilde %*% t(X_b_min_dem)
X_beta_b_max_dem <- beta_tilde %*% t(X_b_max_dem)
X_beta_b_mean <- beta_tilde %*% t(X_b_mean)

# Now transform back to original scale

# First add draws of 1/2*sigma2_tilde to each column
X_beta_sigma_tilde_b_min_dem <- apply(X_beta_b_min_dem, 2, function(x) x + 1/2*sigma2_tilde)
X_beta_sigma_tilde_b_max_dem <- apply(X_beta_b_max_dem, 2, function(x) x + 1/2*sigma2_tilde)
X_beta_sigma_tilde_b_mean <- apply(X_beta_b_mean, 2, function(x) x + 1/2*sigma2_tilde)

# Transform
E_Y_c_b_min_dem <- exp(X_beta_sigma_tilde_b_min_dem) - 1
E_Y_c_b_max_dem <- exp(X_beta_sigma_tilde_b_max_dem) - 1
E_Y_c_b_mean <- exp(X_beta_sigma_tilde_b_mean) - 1

# Summarize to get CIs
CI_E_Y_c_b_min_dem <- apply(E_Y_c_b_min_dem, 2, quantile, c(0.025, 0.975))
CI_E_Y_c_b_max_dem <- apply(E_Y_c_b_max_dem, 2, quantile, c(0.025, 0.975))
CI_E_Y_c_b_mean <- apply(E_Y_c_b_mean, 2, quantile, c(0.025, 0.975))

# Use beta_hat and sigma_hat for point estimates
X_beta_hat_b_min_dem <- beta_hat %*% t(X_b_min_dem)
X_beta_sigma_hat_b_min_dem <- X_beta_hat_b_min_dem + 1/2*sigma_hat^2

X_beta_hat_b_max_dem <- beta_hat %*% t(X_b_max_dem)
X_beta_sigma_hat_b_max_dem <- X_beta_hat_b_max_dem + 1/2*sigma_hat^2

X_beta_hat_b_mean <- beta_hat %*% t(X_b_mean)
X_beta_sigma_hat_b_mean <- X_beta_hat_b_mean + 1/2*sigma_hat^2

# Point estimate
E_Y_c_hat_b_min_dem <- exp(X_beta_sigma_hat_b_min_dem) - 1
E_Y_c_hat_b_max_dem <- exp(X_beta_sigma_hat_b_max_dem) - 1
E_Y_c_hat_b_mean <- exp(X_beta_sigma_hat_b_mean) - 1

# First difference (min democracy (outsource from max to min))
FD_hat_b_min_dem <- E_Y_c_hat_b_min_dem[,2] - E_Y_c_hat_b_min_dem[,1]
FD_b_min_dem <- E_Y_c_b_min_dem[,2] - E_Y_c_b_min_dem[,1]
CI_FD_b_min_dem <- quantile(FD_b_min_dem, c(0.025, 0.975))

# First difference (max democracy (outsource from max to min))
FD_hat_b_max_dem <- E_Y_c_hat_b_max_dem[,2] - E_Y_c_hat_b_max_dem[,1]
FD_b_max_dem <- E_Y_c_b_max_dem[,2] - E_Y_c_b_max_dem[,1]
CI_FD_b_max_dem <- quantile(FD_b_max_dem, c(0.025, 0.975))

## First difference of first difference
FD_FD_hat_b <- FD_hat_b_max_dem - FD_hat_b_min_dem
FD_FD_b <- FD_b_max_dem - FD_b_min_dem
CI_FD_FD_b <- quantile(FD_FD_b, c(0.025, 0.975))


# First difference (outsource mean (max - min democracy))
FD_hat_b_mean <- E_Y_c_hat_b_mean[,2] - E_Y_c_hat_b_mean[,1]
FD_b_mean <- E_Y_c_b_mean[,2] - E_Y_c_b_mean[,1]
CI_FD_b_mean <- quantile(FD_b_mean, c(0.025, 0.975))


#-------------------------------------------------------------------------------
## Plot First differences
#-------------------------------------------------------------------------------
pdf("plots/FD_interaction_between.pdf")

plot(y = 1,
     x = FD_hat_b_min_dem,
     col = viridis(3)[2],
     ylim = c(0,3),
     xlim = c(-16,12),
     xlab = expression("CO"[2]*" emissions [metric tons/capita]"),
     pch = 19,
     main = NULL, 
     bty = "n",
     ylab = "",
     yaxt = "n",
     cex = 3,
     cex.lab = 1.1)

segments(y0 = 1, x0 = CI_FD_b_min_dem[1],
         y1 = 1, x1 = CI_FD_b_min_dem[2],
         col = viridis(3, alpha = 0.3)[2],
         lwd = 12, cex = 1.5, lend = 1)

segments(y0 = 2, x0 = CI_FD_b_max_dem[1],
         y1 = 2, x1 = CI_FD_b_max_dem[2],
         col = viridis(3)[2],
         lwd = 12, cex = 1.5, lend = 1)

points(y = 2,
       x = FD_hat_b_max_dem,
       col = viridis(3)[2],
       xlab = "",
       pch = 19,
       main = NULL, 
       bty = "n",
       ylab = "",
       yaxt = "n",
       cex = 3,
       cex.lab = 1.1)

segments(x0 = 0, y0 = 0,
         x1 = 0, y1 = 3,
         lty = "dashed", lwd = 2)

text(x = FD_hat_b_min_dem,
     y = 0.8,
     labels = c("Less democratic"),
     cex = 1.1)

text(x = FD_hat_b_min_dem,
     y = 0.57,
     labels = c("Outsourcing (max - min)"),
     cex = 0.9)

text(x = FD_hat_b_max_dem-0.35,
     y = 1.8,
     labels = c("More democratic"),
     cex = 1.1)

text(x = FD_hat_b_max_dem-0.35,
     y = 1.57,
     labels = c("Outsourcing (max - min)"),
     cex = 0.9)

dev.off()


## First difference of first difference
pdf("plots/FD_FD_between.pdf")

plot(y = 1.5,
     x = FD_FD_hat_b,
     col = viridis(3)[2],
     ylim = c(0,3),
     xlim = c(-16,5.5),
     xlab = expression("CO"[2]*" emissions [metric tons/capita]"),
     pch = 19,
     main = NULL, 
     bty = "n",
     ylab = "",
     yaxt = "n",
     cex = 3,
     cex.lab = 1.1)

segments(y0 = 1.5, x0 = CI_FD_FD_b[1],
         y1 = 1.5, x1 = CI_FD_FD_b[2],
         col = viridis(3, alpha = 0.3)[2],
         lwd = 12, cex = 1.5, lend = 1)

segments(x0 = 0, y0 = 0,
         x1 = 0, y1 = 3,
         lty = "dashed", lwd = 2)

text(x = FD_FD_hat_b,
     y = 1.26,
     labels = c("FD of FD"),
     cex = 1.1)

text(x = FD_FD_hat_b,
     y = 1.05,
     labels = c("Democracy (more - less)"),
     cex = 0.9)

dev.off()


#-------------------------------------------------------------------------------
# First difference at mean of outsourcing
#-------------------------------------------------------------------------------
pdf("plots/FD_mean_outsource_between.pdf")

plot(y = 1.5,
     x = FD_hat_b_mean,
     col = viridis(3)[2],
     ylim = c(0,3),
     xlim = c(-16,5.5),
     xlab = expression("CO"[2]*" emissions [metric tons/capita]"),
     pch = 19,
     main = NULL, 
     bty = "n",
     ylab = "",
     yaxt = "n",
     cex = 3,
     cex.lab = 1.1)

segments(y0 = 1.5, x0 = CI_FD_b_mean[1],
         y1 = 1.5, x1 = CI_FD_b_mean[2],
         col = viridis(3, alpha = 0.3)[2],
         lwd = 12, cex = 1.5, lend = 1)

segments(x0 = 0, y0 = 0,
         x1 = 0, y1 = 3,
         lty = "dashed", lwd = 2)

text(x = FD_hat_b_mean,
     y = 1.26,
     labels = c("First difference"),
     cex = 1.1)

text(x = FD_hat_b_mean,
     y = 1.05,
     labels = c("Democracy (more - less)"),
     cex = 0.9)

dev.off()


#-------------------------------------------------------------------------------
# OBSERVED VALUE
#-------------------------------------------------------------------------------
# ## Create empty matrix to store the scenarios
# cases_b <- array(NA, 
#                  c(dim(X), 4))
# 
# ## Copy in our X
# cases_b[,,] <- X
# 
# ## Add names
# colnames(cases_b) <- names(beta_hat)
# 
# ## Case 1: min dem & min outsource
# cases_b[ ,"mean_vdem_poly", 1] <- min_dem
# cases_b[ ,"mean_log_cc_outs", 1] <- min_outsource_b
# cases_b[ ,"mean_vdem_poly:mean_log_cc_outs", 1] <- min_dem*min_outsource_b
# 
# ## Case 2: min dem & max outsource
# cases_b[ ,"mean_vdem_poly", 2] <- min_dem
# cases_b[ ,"mean_log_cc_outs", 2] <- max_outsource_b
# cases_b[ ,"mean_vdem_poly:mean_log_cc_outs", 2] <- min_dem*max_outsource_b
# 
# ## Case 3: max dem & min outsource
# cases_b[ ,"mean_vdem_poly", 3] <- max_dem
# cases_b[ ,"mean_log_cc_outs", 3] <- min_outsource_b
# cases_b[ ,"mean_vdem_poly:mean_log_cc_outs", 3] <- max_dem*min_outsource_b
# 
# ## Case 4: max dem & max outsource
# cases_b[ ,"mean_vdem_poly", 4] <- max_dem
# cases_b[ ,"mean_log_cc_outs", 4] <- max_outsource_b
# cases_b[ ,"mean_vdem_poly:mean_log_cc_outs", 4] <- max_dem*max_outsource_b
# 
# 
# #-------------------------------------------------------------------------------
# ## Calculate linear predictor on log scale & retransform
# #-------------------------------------------------------------------------------
# E_Y_c_b <- matrix(NA, nrow = nsim, ncol = 4)  # empty matrix for storage
# 
# for(i in 1:4){                              # four rounds (for scenarios)
#   ev <- beta_tilde %*% t(cases_b[,,i])      # calculate linear predictor on log scale
#   ev_plus_gamma <- ev + 1/2*sigma2_tilde    # add draws of 1/2*sigma2_tilde
#   ev_transf <- exp(ev_plus_gamma)           # transform
#   ev_transf_m <- ev_transf - 1              # E(y) - 1
#   tmp_val <- apply(ev_transf_m, 1, mean)    # now average results
#   E_Y_c_b[,i] <- tmp_val                    # store in val object
# }
# 
# ## Summarize to get CIs
# CI_E_Y_c_b <- apply(E_Y_c_b, 2, quantile, c(0.025, 0.5, 0.975))
# 
# ## Now get the point estimates 
# E_Y_c_hat_b <- matrix(NA, nrow = 4, ncol = 1)
# 
# for(i in 1:4){                                       # four rounds (for scenarios)
#   ev_point <- beta_hat %*% t(cases_b[,,i])           # use beta_hat (not simulated)
#   ev_point_plus_gamma <- ev_point + 1/2*sigma_hat^2  # use sigma_hat (not simulated)
#   ev_point_transf <- exp(ev_point_plus_gamma)        # transform
#   ev_point_transf_m <- ev_point_transf - 1           # E(y) - 1
#   tmp_val_point <- apply(ev_point_transf_m, 1, mean) # now average results
#   E_Y_c_hat_b[i,] <- tmp_val_point                   # store
# }
# 
# #-------------------------------------------------------------------------------
# # First differences
# #-------------------------------------------------------------------------------
# 
# ## First difference: Democracy low -> Outsource max - min
# FD_dem_low_hat_b <- E_Y_c_hat_b[2,] - E_Y_c_hat_b[1,]  ## max - min
# FD_dem_low_b <- E_Y_c_b[,2] - E_Y_c_b[,1] 
# CI_FD_dem_low_b <- quantile(FD_dem_low_b, c(0.025, 0.975))
# 
# ## First difference: Democracy high -> Outsource max - min
# FD_dem_high_hat_b <- E_Y_c_hat_b[4,] - E_Y_c_hat_b[3,]  ## max - min
# FD_dem_high_b <- E_Y_c_b[,4] - E_Y_c_b[,3] 
# CI_FD_dem_high_b <- quantile(FD_dem_high_b, c(0.025, 0.975))
# 
# ## First difference of first difference
# FD_FD_hat_b2 <- FD_dem_high_hat_b - FD_dem_low_hat_b
# FD_FD_b2 <- FD_dem_high_b - FD_dem_low_b
# CI_FD_FD_b2 <- quantile(FD_FD_b2, c(0.025, 0.975))


#-------------------------------------------------------------------------------
## How "big" is the difference compared to variation in the dataset?
#-------------------------------------------------------------------------------
sd_co2 <- aggregate(data_ova$wdi_co2,
                    by = list(data_ova$cname),
                    FUN = sd)

mean_within_country_sd_co2 <- mean(sd_co2[,2], na.rm = T)
mean_between_country_sd_co2 <- sd(data_ova$mean_wdi, na.rm = T)
share_FD_var_w_co2 <- abs(FD_FD_hat_w) / mean_within_country_sd_co2
share_FD_var_b_co2 <- abs(FD_FD_hat_b) / mean_between_country_sd_co2

```




# Robustness

## Replacing DV with PM2.5: ROBUST

```{r replace DV pm25}

#-------------------------------------------------------------------------------
# Model group 2 
#-------------------------------------------------------------------------------
m8_wb_pm25 <- lmer(log_pm25_pop ~ demeaned_log_cc_outs + demeaned_vdem_poly + 
                    demeaned_log_cc_outs * mean_vdem_poly +
                    demeaned_log_pop + demeaned_log_gdp_pc + demeaned_log_gdp_pc_sq +
                    mean_log_cc_outs + mean_vdem_poly +
                    interaction_mean +
                    mean_log_pop +  mean_log_gdp_pc + mean_log_gdp_pc_sq +
                    as.factor(year) + (1 | ccode),
                   data = data_ova)

stargazer(m8_wb, m8_wb_pm25, type = "text", omit = c("ccode", "year"))

```

## Replacing DV with carbon footprint: NOT ROBUST

```{r replace DV carbon footprint}

#-------------------------------------------------------------------------------
# Model group 2 
#-------------------------------------------------------------------------------
m8_wb_cf <- lmer(ef_carb ~ demeaned_log_cc_outs + demeaned_vdem_poly + 
                    demeaned_log_cc_outs * mean_vdem_poly +
                    demeaned_log_pop + demeaned_log_gdp_pc + demeaned_log_gdp_pc_sq +
                    mean_log_cc_outs + mean_vdem_poly +
                    interaction_mean +
                    mean_log_pop +  mean_log_gdp_pc + mean_log_gdp_pc_sq +
                    as.factor(year) + (1 | ccode),
                   data = data_ova)

stargazer(m8_wb, m8_wb_cf, type = "text", omit = c("ccode", "year"))


#-------------------------------------------------------------------------------
# Output
#-------------------------------------------------------------------------------
texreg(list(m8_wb, m8_wb_pm25, m8_wb_cf),
       stars = c(0.01, 0.05, 0.1),
      # file = "texreg2.doc",
       caption = "Alternative Dependent Variables",
       custom.model.names = c("Model 7", "Model A1-PM2.5", "Model A2-Carbon Footprint"),
       label = "tab:table3",
       caption.above = T,
       fontsize = "scriptsize",
       custom.coef.map = list("demeaned_log_cc_outs" = "Pollution Outsourcing (Within)",
                              "mean_log_cc_outs" = "Pollution Outsourcing (Between)",
                              "demeaned_vdem_poly" = "Democracy (Within)",
                              "mean_vdem_poly" = "Democracy (Between)",
                              "demeaned_log_cc_outs:mean_vdem_poly" = "Pollution Outsourcing x Democracy (Within)",
                              "interaction_mean" = "Pollution Outsourcing x Democracy (Between)",
                              "mean_vdem_poly:mean_log_cc_outs" = "Pollution Outsourcing x Democracy (Between)",
                              "demeaned_log_pop" = "Population (Within)",
                              "mean_log_pop" = "Population (Between)",
                              "demeaned_log_gdp_pc" = "GDP per capita (Within)",
                              "demeaned_log_gdp_pc_sq" = "GDP per capita$^2$ (Within)",
                              "mean_log_gdp_pc" = "GDP per capita (Between)",
                              "mean_log_gdp_pc_sq" = "GDP per capita$^2$ (Between)",
                              "(Intercept)" = "Constant"),
        custom.gof.rows = list("Controls" = c("\\ding{51}", "\\ding{51}", "\\ding{51}"),
                               "Country-FE" = c("\\ding{51}", "\\ding{51}", "\\ding{51}"),
                               "Year-FE" = c("\\ding{51}", "\\ding{51}", "\\ding{51}")))

```

## Different outsourcing variables: NOT ROBUST (Between for land use & energy)

```{r diff outsourcing}

#-------------------------------------------------------------------------------
# Model group 2 
#-------------------------------------------------------------------------------
m8_wb_bw <- lmer(log_wdi_co2 ~ demeaned_log_bw_outs + demeaned_vdem_poly + 
                     demeaned_log_bw_outs * mean_vdem_poly +
                     demeaned_log_pop + demeaned_log_gdp_pc + demeaned_log_gdp_pc_sq +
                     mean_log_cc_outs + mean_vdem_poly +
                     interaction_bw_mean +
                     mean_log_pop +  mean_log_gdp_pc + mean_log_gdp_pc_sq +
                     as.factor(year) + (1 | ccode),
                    data = data_ova)

m8_wb_en <- lmer(log_wdi_co2 ~ demeaned_log_en_outs + demeaned_vdem_poly + 
                     demeaned_log_en_outs * mean_vdem_poly +
                     demeaned_log_pop + demeaned_log_gdp_pc + demeaned_log_gdp_pc_sq +
                     mean_log_cc_outs + mean_vdem_poly +
                     interaction_en_mean +
                     mean_log_pop +  mean_log_gdp_pc + mean_log_gdp_pc_sq +
                     as.factor(year) + (1 | ccode),
                    data = data_ova)

m8_wb_lu <- lmer(log_wdi_co2 ~ demeaned_log_lu_outs + demeaned_vdem_poly + 
                     demeaned_log_lu_outs * mean_vdem_poly +
                     demeaned_log_pop + demeaned_log_gdp_pc + demeaned_log_gdp_pc_sq +
                     mean_log_cc_outs + mean_vdem_poly +
                     interaction_lu_mean +
                     mean_log_pop +  mean_log_gdp_pc + mean_log_gdp_pc_sq +
                     as.factor(year) + (1 | ccode),
                    data = data_ova)

m8_wb_mf <- lmer(log_wdi_co2 ~ demeaned_log_mf_outs + demeaned_vdem_poly + 
                     demeaned_log_mf_outs * mean_vdem_poly +
                     demeaned_log_pop + demeaned_log_gdp_pc + demeaned_log_gdp_pc_sq +
                     mean_log_cc_outs + mean_vdem_poly +
                     interaction_mf_mean +
                     mean_log_pop +  mean_log_gdp_pc + mean_log_gdp_pc_sq +
                     as.factor(year) + (1 | ccode),
                    data = data_ova)

stargazer(m8_wb, m8_wb_bw, m8_wb_en, m8_wb_lu, m8_wb_mf, type = "text", omit = c("ccode", "year"))


#-------------------------------------------------------------------------------
# Output
#-------------------------------------------------------------------------------
texreg(list(m8_wb, m8_wb_bw, m8_wb_en, m8_wb_lu, m8_wb_mf),
       stars = c(0.01, 0.05, 0.1),
      # file = "texreg2.doc",
       caption = "Alternative Outsourcing Variables",
       custom.model.names = c("Model 7", "Model A3-bluewater", "Model A4-energy", "Model A5-landuse", "Model A6-material"),
       label = "tab:table4",
       caption.above = T,
       fontsize = "scriptsize",
       custom.coef.map = list("demeaned_log_cc_outs" = "Pollution Outsourcing (Within)",
                              "demeaned_log_bw_outs" = "Pollution Outsourcing (Within)",
                              "demeaned_log_en_outs" = "Pollution Outsourcing (Within)",
                              "demeaned_log_lu_outs" = "Pollution Outsourcing (Within)",
                              "demeaned_log_mf_outs" = "Pollution Outsourcing (Within)",
                              "mean_log_cc_outs" = "Pollution Outsourcing (Between)",
                              "mean_log_bw_outs" = "Pollution Outsourcing (Between)",
                              "mean_log_en_outs" = "Pollution Outsourcing (Between)",
                              "mean_log_lu_outs" = "Pollution Outsourcing (Between)",
                              "mean_log_mf_outs" = "Pollution Outsourcing (Between)",
                              "demeaned_vdem_poly" = "Democracy (Within)",
                              "mean_vdem_poly" = "Democracy (Between)",
                              "demeaned_log_cc_outs:mean_vdem_poly" = "Pollution Outsourcing x Democracy (Within)",
                              "demeaned_log_bw_outs:mean_vdem_poly" = "Pollution Outsourcing x Democracy (Within)",
                              "demeaned_log_en_outs:mean_vdem_poly" = "Pollution Outsourcing x Democracy (Within)",
                              "demeaned_log_lu_outs:mean_vdem_poly" = "Pollution Outsourcing x Democracy (Within)",
                              "demeaned_log_mf_outs:mean_vdem_poly" = "Pollution Outsourcing x Democracy (Within)",
                              "mean_vdem_poly:mean_log_cc_outs" = "Pollution Outsourcing x Democracy (Between)",
                              "interaction_mean" = "Pollution Outsourcing x Democracy (Between)",
                              "interaction_bw_mean" = "Pollution Outsourcing x Democracy (Between)",
                              "interaction_en_mean" = "Pollution Outsourcing x Democracy (Between)",
                              "interaction_lu_mean" = "Pollution Outsourcing x Democracy (Between)",
                              "interaction_mf_mean" = "Pollution Outsourcing x Democracy (Between)",
                              "demeaned_log_pop" = "Population (Within)",
                              "mean_log_pop" = "Population (Between)",
                              "demeaned_log_gdp_pc" = "GDP per capita (Within)",
                              "demeaned_log_gdp_pc_sq" = "GDP per capita$^2$ (Within)",
                              "mean_log_gdp_pc" = "GDP per capita (Between)",
                              "mean_log_gdp_pc_sq" = "GDP per capita$^2$ (Between)",
                              "(Intercept)" = "Constant"),
        custom.gof.rows = list("Controls" = c("\\ding{51}", "\\ding{51}", "\\ding{51}","\\ding{51}", "\\ding{51}"),
                               "Country-FE" = c("\\ding{51}", "\\ding{51}", "\\ding{51}", "\\ding{51}", "\\ding{51}"),
                               "Year-FE" = c("\\ding{51}", "\\ding{51}", "\\ding{51}", "\\ding{51}", "\\ding{51}")))

```

## Alternative operationalization outsourcing: Total sum of pollution outsourced

```{r total sum}

#-------------------------------------------------------------------------------
# Model group 1
#-------------------------------------------------------------------------------
m1_wb_sum <- lmer(log_cc_sum_outs ~ demeaned_vdem_poly + mean_vdem_poly + as.factor(year) +
                   (1 | ccode), 
                  data = data_ova)

m2_wb_sum <- lmer(log_cc_sum_outs ~ demeaned_log_pop + demeaned_log_gdp_pc + demeaned_log_gdp_pc_sq + 
                   mean_log_pop + mean_log_gdp_pc + mean_log_gdp_pc_sq +
                   as.factor(year) + (1 | ccode),
                  data = data_ova)

m3_wb_sum <- lmer(log_cc_sum_outs ~ demeaned_vdem_poly + demeaned_log_gdp_pc + demeaned_log_gdp_pc_sq + 
                   demeaned_log_pop + 
                   mean_vdem_poly + mean_log_gdp_pc + mean_log_gdp_pc_sq + mean_log_pop + as.factor(year) +
                   (1 | ccode), 
                  data = data_ova)

stargazer(m1_wb_sum, m2_wb_sum, m3_wb_sum, type = "text", omit = c("ccode", "year"))


#-------------------------------------------------------------------------------
## Output
#-------------------------------------------------------------------------------
texreg(list(m1_wb_sum, m2_wb_sum, m3_wb_sum),
       stars = c(0.01, 0.05, 0.1),
       #file = "texreg1.doc",
       caption = "Total Sum of Pollution Outsourced - Analysis 1",
       custom.model.names = c("Model A7", "Model A8", "Model A9"),
       label = "tab:table5",
       caption.above = T,
       fontsize = "scriptsize",
       custom.coef.map = list("demeaned_vdem_poly" = "Democracy (Within)",
                              "mean_vdem_poly" = "Democracy (Between)",
                              "demeaned_log_pop" = "Population (Within)",
                              "mean_log_pop" = "Population (Between)",
                              "demeaned_log_gdp_pc" = "GDP per capita (Within)",
                              "demeaned_log_gdp_pc_sq" = "GDP per capita$^2$ (Within)", 
                              "mean_log_gdp_pc" = "GDP per capita (Between)",
                              "mean_log_gdp_pc_sq" = "GDP per capita$^2$ (Between)", 
                              "(Intercept)" = "Constant"),
        custom.gof.rows = list("Controls" = c("\\xmark", "\\ding{51}", "\\ding{51}"),
                               "Country-FE" = c("\\ding{51}", "\\ding{51}", "\\ding{51}"),
                               "Year-FE" = c("\\ding{51}", "\\ding{51}", "\\ding{51}")))


#-------------------------------------------------------------------------------
# Model group 2
#-------------------------------------------------------------------------------
m5_wb_sum <- lmer(log_wdi_co2 ~ demeaned_log_cc_sum_outs + demeaned_vdem_poly + 
                   mean_log_cc_sum_outs + mean_vdem_poly +
                   as.factor(year) + (1 | ccode),
                  data = data_ova)

m6_wb_sum <- lmer(log_wdi_co2 ~ demeaned_log_cc_sum_outs + demeaned_vdem_poly +
                   demeaned_log_cc_sum_outs * mean_vdem_poly + 
                   mean_log_cc_sum_outs + mean_vdem_poly +
                   interaction_cc_sum_mean +
                   as.factor(year) + (1 | ccode),
                  data = data_ova)

m8_wb_sum <- lmer(log_wdi_co2 ~ demeaned_log_cc_sum_outs + demeaned_vdem_poly + 
                   demeaned_log_cc_sum_outs * mean_vdem_poly +
                   demeaned_log_pop + demeaned_log_gdp_pc + demeaned_log_gdp_pc_sq +
                   mean_log_cc_sum_outs + mean_vdem_poly +
                   interaction_cc_sum_mean +
                   mean_log_pop +  mean_log_gdp_pc + mean_log_gdp_pc_sq +
                   as.factor(year) + (1 | ccode),
                  data = data_ova)

stargazer(m5_wb_sum, m6_wb_sum, m7_wb, m8_wb_sum, type = "text", omit = c("ccode", "year"))


#-------------------------------------------------------------------------------
# Output
#-------------------------------------------------------------------------------
texreg(list(m5_wb_sum, m6_wb_sum, m7_wb, m8_wb_sum),
       stars = c(0.01, 0.05, 0.1),
      # file = "texreg2.doc",
       caption = "Total Sum of Pollution Outsourced - Analysis 2",
       custom.model.names = c("Model A10", "Model A11", "Model A12", "Model A13"),
       label = "tab:table6",
       caption.above = T,
       fontsize = "scriptsize",
       custom.coef.map = list("demeaned_log_cc_sum_outs" = "Pollution Outsourcing (Within)",
                              "mean_log_cc_sum_outs" = "Pollution Outsourcing (Between)",
                              "demeaned_vdem_poly" = "Democracy (Within)",
                              "mean_vdem_poly" = "Democracy (Between)",
                              "demeaned_log_cc_sum_outs:mean_vdem_poly" = "Pollution Outsourcing x Democracy (Within)",
                              "interaction_cc_sum_mean" = "Pollution Outsourcing x Democracy (Between)",
                              "demeaned_log_pop" = "Population (Within)",
                              "mean_log_pop" = "Population (Between)",
                              "demeaned_log_gdp_pc" = "GDP per capita (Within)",
                              "demeaned_log_gdp_pc_sq" = "GDP per capita$^2$ (Within)",
                              "mean_log_gdp_pc" = "GDP per capita (Between)",
                              "mean_log_gdp_pc_sq" = "GDP per capita$^2$ (Between)",
                              "(Intercept)" = "Constant"),
        custom.gof.rows = list("Controls" = c("\\xmark", "\\xmark", "\\ding{51}", "\\ding{51}"),
                               "Country-FE" = c("\\ding{51}", "\\ding{51}", "\\ding{51}", "\\ding{51}"),
                               "Year-FE" = c("\\ding{51}", "\\ding{51}", "\\ding{51}", "\\ding{51}")))


```

## Alternative operationalization for democracy: ROBUST

```{r diff  dem}

#-------------------------------------------------------------------------------
# Model group 2 
#-------------------------------------------------------------------------------
m8_wb_polity <- lmer(log_wdi_co2 ~ demeaned_log_cc_outs + demeaned_polity + 
                      demeaned_log_cc_outs * mean_polity +
                      demeaned_log_pop + demeaned_log_gdp_pc + demeaned_log_gdp_pc_sq +
                      mean_log_cc_outs + mean_polity +
                      interaction_polity_mean +
                      mean_log_pop +  mean_log_gdp_pc + mean_log_gdp_pc_sq +
                      as.factor(year) + (1 | ccode),
                     data = data_ova)

stargazer(m8_wb, m8_wb_polity, type = "text", omit = c("ccode", "year"))


#-------------------------------------------------------------------------------
# Output
#-------------------------------------------------------------------------------
texreg(list(m8_wb, m8_wb_polity),
       stars = c(0.01, 0.05, 0.1),
      # file = "texreg2.doc",
       digits = 3,
       caption = "Alternative Operationalization for Democracy",
       custom.model.names = c("Model 7-Vdem", "Model A14-Polity"),
       label = "tab:table6",
       caption.above = T,
       fontsize = "scriptsize",
       custom.coef.map = list("demeaned_log_cc_outs" = "Pollution Outsourcing (Within)",
                              "mean_log_cc_outs" = "Pollution Outsourcing (Between)",
                              "demeaned_vdem_poly" = "Democracy (Within)",
                              "demeaned_polity" = "Democracy (Within)",
                              "mean_vdem_poly" = "Democracy (Between)",
                              "mean_polity" = "Democracy (Between)",
                              "demeaned_log_cc_outs:mean_polity" = "Pollution Outsourcing x Democracy (Within)",
                              "demeaned_log_cc_outs:mean_vdem_poly" = "Pollution Outsourcing x Democracy (Within)",
                              "interaction_mean" = "Pollution Outsourcing x Democracy (Between)",
                              "mean_vdem_poly:mean_log_cc_outs" = "Pollution Outsourcing x Democracy (Between)",
                              "interaction_polity_mean" = "Pollution Outsourcing x Democracy (Between)",
                              "demeaned_log_pop" = "Population (Within)",
                              "mean_log_pop" = "Population (Between)",
                              "demeaned_log_gdp_pc" = "GDP per capita (Within)",
                              "demeaned_log_gdp_pc_sq" = "GDP per capita$^2$ (Within)",
                              "mean_log_gdp_pc" = "GDP per capita (Between)",
                              "mean_log_gdp_pc_sq" = "GDP per capita$^2$ (Between)",
                              "(Intercept)" = "Constant"),
        custom.gof.rows = list("Controls" = c("\\ding{51}", "\\ding{51}"),
                               "Country-FE" = c("\\ding{51}", "\\ding{51}"),
                               "Year-FE" = c("\\ding{51}", "\\ding{51}")))

```

## Additional controls: Political globaliz. & env. treaties: NOT ROBUST (BETWEEN)

```{r additional controls}

#-------------------------------------------------------------------------------
# Model group 2 
#-------------------------------------------------------------------------------
m8_wb_ac1 <- lmer(log_wdi_co2 ~ demeaned_log_cc_outs + demeaned_vdem_poly + 
                    demeaned_log_cc_outs * mean_vdem_poly +
                    demeaned_log_pop + demeaned_log_gdp_pc + demeaned_log_gdp_pc_sq +
                    demeaned_trade + 
                    mean_log_cc_outs + mean_vdem_poly +
                    interaction_mean +
                    mean_log_pop +  mean_log_gdp_pc + mean_log_gdp_pc_sq +
                    mean_trade +
                    as.factor(year) + (1 | ccode),
                   data = data_ova)

m8_wb_ac2 <- lmer(log_wdi_co2 ~ demeaned_log_cc_outs + demeaned_vdem_poly + 
                    demeaned_log_cc_outs * mean_vdem_poly +
                    demeaned_log_pop + demeaned_log_gdp_pc + demeaned_log_gdp_pc_sq +
                   # demeaned_env_treat + (only between because variation only between-countries)
                    mean_log_cc_outs + mean_vdem_poly +
                    interaction_mean +
                    mean_log_pop +  mean_log_gdp_pc + mean_log_gdp_pc_sq +
                    mean_env_treat +
                    as.factor(year) + (1 | ccode),
                   data = data_ova)

m8_wb_ac3 <- lmer(log_wdi_co2 ~ demeaned_log_cc_outs + demeaned_vdem_poly + 
                    demeaned_log_cc_outs * mean_vdem_poly +
                    demeaned_log_pop + demeaned_log_gdp_pc + demeaned_log_gdp_pc_sq +
                    demeaned_dr_pg + 
                    mean_log_cc_outs + mean_vdem_poly +
                    interaction_mean +
                    mean_log_pop +  mean_log_gdp_pc + mean_log_gdp_pc_sq +
                    mean_dr_pg +
                    as.factor(year) + (1 | ccode),
                   data = data_ova)

m8_wb_ac4 <- lmer(log_wdi_co2 ~ demeaned_log_cc_outs + demeaned_vdem_poly + 
                    demeaned_log_cc_outs * mean_vdem_poly +
                    demeaned_log_pop + demeaned_log_gdp_pc + demeaned_log_gdp_pc_sq +
                    demeaned_gdp + 
                    mean_log_cc_outs + mean_vdem_poly +
                    interaction_mean +
                    mean_log_pop +  mean_log_gdp_pc + mean_log_gdp_pc_sq +
                    mean_gdp +
                    as.factor(year) + (1 | ccode),
                   data = data_ova)

m8_wb_ac_all <- lmer(log_wdi_co2 ~ demeaned_log_cc_outs + demeaned_vdem_poly + 
                      demeaned_log_cc_outs * mean_vdem_poly +
                      demeaned_log_pop + demeaned_log_gdp_pc + demeaned_log_gdp_pc_sq +
                      demeaned_trade + demeaned_dr_pg + demeaned_gdp + 
                      mean_log_cc_outs + mean_vdem_poly +
                      interaction_mean +
                      mean_log_pop +  mean_log_gdp_pc + mean_log_gdp_pc_sq +
                      mean_trade + mean_env_treat + mean_dr_pg + mean_gdp +
                      as.factor(year) + (1 | ccode),
                    data = data_ova)


stargazer(m8_wb, m8_wb_ac1, m8_wb_ac2, m8_wb_ac3, m8_wb_ac4, m8_wb_ac_all, type = "text", omit = c("ccode", "year"))


texreg(list(m8_wb, m8_wb_ac1, m8_wb_ac2, m8_wb_ac3, m8_wb_ac4, m8_wb_ac_all),
       stars = c(0.01, 0.05, 0.1),
      # file = "texreg2.doc",
       caption = "Additional Control Variables",
       custom.model.names = c("Model 7", "Model A15", "Model A16", "Model A17", "Model A18", "Model A19"),
       label = "tab:table7",
       caption.above = T,
       fontsize = "scriptsize",
       custom.coef.map = list("demeaned_log_cc_outs" = "Pollution Outsourcing (Within)",
                              "mean_log_cc_outs" = "Pollution Outsourcing (Between)",
                              "demeaned_polity" = "Democracy (Within)",
                              "mean_vdem_poly" = "Democracy (Between)",
                              "demeaned_log_cc_outs:mean_vdem_poly" = "Pollution Outsourcing x Democracy (Within)",
                              "mean_vdem_poly:mean_log_cc_outs" = "Pollution Outsourcing x Democracy (Between)",
                              "interaction_mean" = "Pollution Outsourcing x Democracy (Between)",
                              "mean_vdem_poly:mean_log_cc_outs" = "Pollution Outsourcing x Democracy (Between)",
                              "demeaned_log_pop" = "Population (Within)",
                              "mean_log_pop" = "Population (Between)",
                              "demeaned_log_gdp_pc" = "GDP per capita (Within)",
                              "demeaned_log_gdp_pc_sq" = "GDP per capita$^2$ (Within)",
                              "mean_log_gdp_pc" = "GDP per capita (Between)",
                              "mean_log_gdp_pc_sq" = "GDP per capita$^2$ (Between)",
                              "demeaned_trade" = "Trade Openness (Within)",
                              "mean_trade" = "Trade Openness (Between)",
                              "mean_env_treat" = "Environmental Treaties (Between)",
                              "demeaned_dr_pg" = "Political Globalization (Within)",
                              "mean_dr_pg" = "Political Globalization (Between)",
                              "demeaned_gdp" = "GDP (Within)",
                              "mean_gdp" = "GDP (Between)",
                              "(Intercept)" = "Constant"),
        custom.gof.rows = list("Controls" = c("\\ding{51}", "\\ding{51}", "\\ding{51}", "\\ding{51}", "\\ding{51}", "\\ding{51}"),
                               "Country-FE" = c("\\ding{51}", "\\ding{51}", "\\ding{51}", "\\ding{51}", "\\ding{51}", "\\ding{51}"),
                               "Year-FE" = c("\\ding{51}", "\\ding{51}", "\\ding{51}", "\\ding{51}", "\\ding{51}", "\\ding{51}")))


```

## Drop high-income states: ROBUST

```{r drop high-income}

#-------------------------------------------------------------------------------
# Model group 2 
#-------------------------------------------------------------------------------
m8_wb_low_income <- lmer(log_wdi_co2 ~ demeaned_log_cc_outs + demeaned_vdem_poly + 
                          demeaned_log_cc_outs * mean_vdem_poly +
                          demeaned_log_pop + demeaned_log_gdp_pc + demeaned_log_gdp_pc_sq +
                          mean_log_cc_outs + mean_vdem_poly +
                          interaction_mean +
                          mean_log_pop +  mean_log_gdp_pc + mean_log_gdp_pc_sq +
                          as.factor(year) + (1 | ccode),
                         data = data_ova[data_ova$log_gdp_pc < 10.1,])

stargazer(m8_wb, m8_wb_low_income, type = "text", omit = c("ccode", "year"))


#-------------------------------------------------------------------------------
# Output
#-------------------------------------------------------------------------------
texreg(list(m8_wb, m8_wb_low_income),
       stars = c(0.01, 0.05, 0.1),
      # file = "texreg2.doc",
       caption = "Exclusion of the Richest States",
       custom.model.names = c("Model 7", "Model A20"),
       label = "tab:table9",
       caption.above = T,
       fontsize = "scriptsize",
       custom.coef.map = list("demeaned_log_cc_outs" = "Pollution Outsourcing (Within)",
                              "mean_log_cc_outs" = "Pollution Outsourcing (Between)",
                              "demeaned_polity" = "Democracy (Within)",
                              "mean_vdem_poly" = "Democracy (Between)",
                              "demeaned_log_cc_outs:mean_vdem_poly" = "Pollution Outsourcing x Democracy (Within)",
                              "mean_vdem_poly:mean_log_cc_outs" = "Pollution Outsourcing x Democracy (Between)",
                              "interaction_mean" = "Pollution Outsourcing x Democracy (Between)",
                              "demeaned_log_pop" = "Population (Within)",
                              "mean_log_pop" = "Population (Between)",
                              "demeaned_log_gdp_pc" = "GDP per capita (Within)",
                              "demeaned_log_gdp_pc_sq" = "GDP per capita$^2$ (Within)",
                              "mean_log_gdp_pc" = "GDP per capita (Between)",
                              "mean_log_gdp_pc_sq" = "GDP per capita$^2$ (Between)",
                              "(Intercept)" = "Constant"),
        custom.gof.rows = list("Controls" = c("\\ding{51}", "\\ding{51}"),
                               "Country-FE" = c("\\ding{51}", "\\ding{51}"),
                               "Year-FE" = c("\\ding{51}", "\\ding{51}")))
```

## Sample split instead of interaction: ROBUST

```{r sample split}

#-------------------------------------------------------------------------------
# Model group 2 - Split by median of democracy variable (within effects)
#-------------------------------------------------------------------------------
data_dem_low <- data_ova[which(data_ova$vdem_polyarchy < median(data_ova$vdem_polyarchy, na.rm = T)),]
data_dem_high <- data_ova[which(data_ova$vdem_polyarchy >= median(data_ova$vdem_polyarchy, na.rm = T)),]

m8_dem_low <- lmer(log_wdi_co2 ~ demeaned_log_cc_outs + #demeaned_vdem_poly + 
                    demeaned_log_pop + demeaned_log_gdp_pc + demeaned_log_gdp_pc_sq +
                    mean_log_cc_outs + #mean_vdem_poly +
                    mean_log_pop +  mean_log_gdp_pc + mean_log_gdp_pc_sq +
                    as.factor(year) + (1 | ccode),
                   data = data_dem_low)

m8_dem_high <- lmer(log_wdi_co2 ~ demeaned_log_cc_outs + #demeaned_vdem_poly + 
                     demeaned_log_pop + demeaned_log_gdp_pc + demeaned_log_gdp_pc_sq +
                     mean_log_cc_outs + #mean_vdem_poly +
                     mean_log_pop +  mean_log_gdp_pc + mean_log_gdp_pc_sq +
                     as.factor(year) + (1 | ccode),
                    data = data_dem_high)

stargazer(m8_dem_low, m8_dem_high, type = "text", omit = c("ccode", "year"))


#-------------------------------------------------------------------------------
# Model group 2 - Split by median of mean democracy variable (between effects)
#-------------------------------------------------------------------------------
data_dem_low2 <- data_ova[which(data_ova$mean_vdem_poly < median(data_ova$mean_vdem_poly, na.rm = T)),]
data_dem_high2 <- data_ova[which(data_ova$mean_vdem_poly >= median(data_ova$mean_vdem_poly, na.rm = T)),]

m9_dem_low <- lmer(log_wdi_co2 ~ demeaned_log_cc_outs + #demeaned_vdem_poly + 
                    demeaned_log_pop + demeaned_log_gdp_pc + demeaned_log_gdp_pc_sq +
                    mean_log_cc_outs + #mean_vdem_poly +
                    mean_log_pop +  mean_log_gdp_pc + mean_log_gdp_pc_sq +
                    as.factor(year) + (1 | ccode),
                   data = data_dem_low2)

m9_dem_high <- lmer(log_wdi_co2 ~ demeaned_log_cc_outs + #demeaned_vdem_poly + 
                     demeaned_log_pop + demeaned_log_gdp_pc + demeaned_log_gdp_pc_sq +
                     mean_log_cc_outs + #mean_vdem_poly +
                     mean_log_pop +  mean_log_gdp_pc + mean_log_gdp_pc_sq +
                     as.factor(year) + (1 | ccode),
                    data = data_dem_high2)

stargazer(m9_dem_low, m9_dem_high, type = "text", omit = c("ccode", "year"))


#-------------------------------------------------------------------------------
# Output
#-------------------------------------------------------------------------------
texreg(list(m8_dem_low, m8_dem_high),
       stars = c(0.01, 0.05, 0.1),
      # file = "texreg3.doc",
       caption = "Alternative Model Specification - Sample Split (Within)",
       custom.model.names = c("Model A21-Low dem.", "Model A21-High dem."),
       label = "tab:table10",
       caption.above = T,
       fontsize = "scriptsize",
       custom.coef.map = list("demeaned_log_cc_outs" = "Pollution Outsourcing (Within)",
                              "mean_log_cc_outs" = "Pollution Outsourcing (Between)",
                              "demeaned_log_pop" = "Population (Within)",
                              "mean_log_pop" = "Population (Between)",
                              "demeaned_log_gdp_pc" = "GDP per capita (Within)",
                              "demeaned_log_gdp_pc_sq" = "GDP per capita$^2$ (Within)",
                              "mean_log_gdp_pc" = "GDP per capita (Between)",
                              "mean_log_gdp_pc_sq" = "GDP per capita$^2$ (Between)",
                              "(Intercept)" = "Constant"),
        custom.gof.rows = list("Controls" = c("\\ding{51}", "\\ding{51}"),
                               "Country-FE" = c("\\ding{51}", "\\ding{51}"),
                               "Year-FE" = c("\\ding{51}", "\\ding{51}")))

texreg(list(m9_dem_low, m9_dem_high),
       stars = c(0.01, 0.05, 0.1),
      # file = "texreg4.doc",
       caption = "Alternative Model Specification - Sample Split (Between)",
       custom.model.names = c("Model A22Low dem.", "Model A22-High dem."),
       label = "tab:table11",
       caption.above = T,
       fontsize = "scriptsize",
       custom.coef.map = list("demeaned_log_cc_outs" = "Pollution Outsourcing (Within)",
                              "mean_log_cc_outs" = "Pollution Outsourcing (Between)",
                              "demeaned_log_pop" = "Population (Within)",
                              "mean_log_pop" = "Population (Between)",
                              "demeaned_log_gdp_pc" = "GDP per capita (Within)",
                              "demeaned_log_gdp_pc_sq" = "GDP per capita$^2$ (Within)",
                              "mean_log_gdp_pc" = "GDP per capita (Between)",
                              "mean_log_gdp_pc_sq" = "GDP per capita$^2$ (Between)",
                              "(Intercept)" = "Constant"),
        custom.gof.rows = list("Controls" = c("\\ding{51}", "\\ding{51}"),
                               "Country-FE" = c("\\ding{51}", "\\ding{51}"),
                               "Year-FE" = c("\\ding{51}", "\\ding{51}")))

```

